INTRODUCTION


This is the narrative portion of the Final Report of ONR Contract
No. N0014-69-A-0423, NR 042-261. The final technical report
(documentation, user manuals, displays, tables) was prepared and
distributed in six volumes, entitled Final Report, Appendix Volumes
I to VI, in July 1974. Since this narrative makes considerable use of
these appendices, they should be regarded as an integral part of this
Final Report.

This summary report is, to some extent, a reflection of the
developments in interactive computation from 1969 to 1975. Very
considerable systems programming work was required in the initial
stages. In 1970 we were fortunate to receive a copy of the GRAF
package from the Health Facility of UCLA. After some work,
especially to fit these routines into our narrower scope, this
tool enabled us to obtain operational modules in 1971 and 1972.

By that time we had developed our own monitor system, better
adapted to our needs, especially for statistical systems. A
manipulative calculator unit involving statistical distribution
routines (central and non-central) became available and was
integrated with several of the later units (response surfaces,
interactive OMNITAB). From the viewpoint of 1977 this may appear fairly
common, but in 1971 there were very few systems having that facility.

The rapid changes of computer configurations, and the availability
of cheaper terminals that could handle all but the most elaborate
displays of the IBM 2250 graphics system, made our system somewhat
anachronistic by 1973, and efforts were begun to adapt or prepare
interactive units for other configurations. However, in the context
of this project we refrained from preparing "statistical packages"
and, where we needed no graphics, presented statistical routines
as library functions, easily accessible in FORTRAN. Since 1974, this
part of our interactive software (see, especially, Bouver and Bargmann
[42]) has been used much more frequently than the graphics terminal.


In 1974 we took the 2840 Control Unit off maintenance, because
the teletype terminals widely available by then had become less expensive
to purchase than a one-year maintenance agreement on the graphics
system. Some of the adaptation work was done in 1973 and is described
in this report (Hayward and Bargmann [38]); this process is still
in progress. As units are adapted and documented, credit will
be given to this grant, under which the original design work was done.

The changes in hardware (360 to 370, changes of disk drives and
core configurations) and environment (variable storage (MVS)), which
took place between 1973 and 1976, required considerable reprogramming,
especially since procedures in the newer configurations were not
readily adaptable to the older (1965) hardware and channel configuration.
There were delays, often extending over several months, but we did
succeed in getting the interactive graphics system operational
under the latest configuration in October/November of 1977. Except
where otherwise noted, the photographs of the video screen exhibited in
this report were made from the latest version.

As anticipated in the original THEMIS proposal of 1969, the major
part of the software development was done by graduate students, who
performed these tasks in conjunction with their thesis or dissertation
work. In fact, three of the major systems and applications units
constituted Ph.D. dissertations devoted almost entirely to the development
of statistical software ("An On-Line Statistical System for Lay
Usage", Penn [12]; "Interactive OMNITAB for Statistical Usage", Bingham
and Bargmann [41]; and "Graphical Aids for Statistical Computation",
Bond and Bargmann [43]). Except for the very earliest tasks (the
prior THEMIS task had a broader spectrum of coverage), all research
reports, theses, and dissertations prepared under this grant included
substantial work in interactive statistical computation, to which
the task has been limited since 1970.

Transportability has never been a problem for the mathematical
and statistical subroutines, since all were written in FORTRAN.
The major difference in the programs for CDC Cyber and IBM is the
need for double precision in the latter, which is seldom required in the
former. For the device-specific graphics subroutines the problem
was quite difficult. An adaptation was successfully made at the
University of Arkansas in 1972. The routines compiled with the aid
of the UCLA GRAF system are not easily reconstructed from the source
programs. The load modules need to be changed directly (ZAP) to
accommodate changes in configuration. The version presently on tape
has been shown to work successfully on an IBM 370/158 (MVS) via a
2840 mod I controller and an IBM 2250 console (which need not have
absolute vector capability); see Chapter 3.

The graphics monitor system has had considerable use in research
and classroom work. At one time (1972-73) it was available 24 hours a
day and was often used for preparation of demonstrations and in thesis
research. The batch versions of some of the larger units have even
wider use inside and outside the University of Georgia. Since the
availability of "express" runs and quick turnaround even of batch
jobs, some of the units requiring large output (e.g., Analysis of
Covariance, Hierarchical Multivariate Analysis) can be handled
promptly in batch mode, even though repeated user intervention is
required.

This narrative report consists of five chapters. Chapters 1
and 2 describe interactive graphical units and present examples of
the use of these units, with photographs of the screen taken in the
Fall of 1977, when the interpreted system was operational on the latest
monitor. Chapter 3 consists of a short description of the graphics
system, including examples and considerations of transportation and
adaptation to other systems. For the detailed description of these
programs and their uses the reader is referred to the technical
documentation in the appendix volumes or THEMIS reports.

Chapter 4 describes those tasks which supported the software
development, especially numerical analysis work and the development
and testing of efficient and precise modules. Other statistical
tasks performed under this grant are described in Chapter 5.

Technical details, instructions to users and programmers,
important formulas, tables, and selected displays of program uses are
contained in six appendix volumes to this Final Report, which were
distributed in the Fall of 1974 and to which frequent reference
has been made in this narrative portion.


CHAPTER 1

First Phase: Programs under the UCLA GRAF MONITOR

1.1 Interactive Input-Output Analysis

Documentation: Appendix C, Appendix Vol. I, pp. 50-132, and THEMIS
Report No. 11 (Fortson [15]).

After a call to the program segment SLINK MODEL the user is
required to describe his network in terms of flow of materials from
Input to Intermediate to Output products, and efficiencies (Leontief
algorithms). A forecaster will use this network by calling SLINK
FORECAST and answering questions on demand for output, demand (if
any) for intermediate products, variation of demand, and dependence
of demand on the general economic trend (a one-factor factor-analytic
model is used for the estimation of the covariance matrices). Changes
in product flow can be made, at this stage, with a light pen. Novel
features in this unit are:

Establishment of a variance-covariance matrix of demand, and
its effect on the variance of input requirements, in the Leontief
model.

The user is expected to express, as a coefficient of determination (in
percent), the relationship between sales of a product and general
economic activity. This relationship may be negative (e.g., for spare
parts). The program constructs a one-factor factor-analytic model
for the variance-covariance matrix of demands. In a class taught
at the Lockheed program of the Georgia Institute of Technology (1971),
some students investigated, by simulation, the effect on requirements
if the underlying model has, in fact, two "common factors." The changes
were quite insignificant.
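
The variance propagation described above can be sketched as follows. This is a minimal illustration in Python, not the original FORTRAN: the function names, the requirements matrix M (units of each input needed per unit of each output product), and all numbers are assumptions made for the example. Under a one-factor model, the loading of each product on the general economic factor is taken here as the signed square root of its coefficient of determination, so the correlation between two demands is the product of their loadings.

```python
import math

def one_factor_covariance(std_devs, determination_pct, signs):
    """Demand covariance matrix implied by a one-factor model.

    determination_pct[i] is the percent of the variance of demand i that
    is explained by the general economic factor; signs[i] is +1 or -1
    (negative for counter-cyclical products such as spare parts).
    """
    loadings = [s * math.sqrt(p / 100.0)
                for s, p in zip(signs, determination_pct)]
    n = len(std_devs)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            corr = 1.0 if i == j else loadings[i] * loadings[j]
            cov[i][j] = std_devs[i] * std_devs[j] * corr
    return cov

def requirement_covariance(M, cov):
    """Covariance of input requirements r = M d, that is M * cov * M^T."""
    rows, n = len(M), len(cov)
    MC = [[sum(M[i][k] * cov[k][j] for k in range(n)) for j in range(n)]
          for i in range(rows)]
    return [[sum(MC[i][k] * M[j][k] for k in range(n)) for j in range(rows)]
            for i in range(rows)]

# Hypothetical example: two products, one raw input.
demand_cov = one_factor_covariance([10.0, 5.0], [64.0, 36.0], [+1, -1])
input_cov = requirement_covariance([[1.0, 2.0]], demand_cov)
print(demand_cov, input_cov)
```

In this made-up case the negative loading of the second product partly offsets the first, so the variance of the input requirement is smaller than it would be if the two demands were positively related.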

The final demonstration was made by Mr. Satish Mehra (in conjunction
with his dissertation work in Management) in October 1977, to study
the sensitivity of requirements to estimates of the dependence of
demand on general economic trends. Sample displays were photographed
from the IBM 2250 screen and are shown in Figures 1.1.1 to 1.1.8. The
data were based on an actual production and inventory network. The
study showed that a wide range of trial values for the coefficient
of variation (from zero to 20 percent to the empirically found 53 percent)
produced only very small fluctuations.

This unit was demonstrated at N. C. State College in Raleigh,
N. C., in 1970, at Iowa State University in Ames, Iowa, in 1972, and
at several national meetings of the American Statistical Association.
It was used in our Information Systems courses (STA 804). Many
visiting groups saw this unit in operation. It was implemented at
the University of Arkansas in 1973.

An adaptation to non-graphics terminal use is now under development.

1.2 Interactive Quantal Analysis


Documentation: Appendix H, Appendix Vol. II, pp. 246-281, and THEMIS
Report No. 16 (Ishee [21]).

To perform this analysis, commonly known as Bio-Assay, the user
connects to SLINK QUANTAL and fills in the blanks shown in Figure 1.2.1.
After data entry is complete the user has the option to ask for analysis
by Probit, Logit, log-log, arcsine, or Weibull transformation with
shape parameters 1 to 6. Figure 1.2.2 appears on the screen and
indicates how well the data are approximated by the chosen growth
curve.
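
The idea of choosing a transformation and judging the fit can be illustrated with a small sketch. It is not the QUANTAL algorithm: it uses an unweighted least-squares line through the transformed response proportions, whereas bio-assay programs usually iterate a weighted fit. The function names and the data are invented, "loglog" is assumed to mean the complementary log-log scale, and batches with 0 or 100 percent response cannot be transformed this way.

```python
import math
from statistics import NormalDist

def transform(p, method):
    """Map a response proportion 0 < p < 1 onto a linearizing scale."""
    if method == "probit":
        return NormalDist().inv_cdf(p)          # inverse normal CDF
    if method == "logit":
        return math.log(p / (1.0 - p))          # log odds
    if method == "loglog":
        return math.log(-math.log(1.0 - p))     # complementary log-log
    raise ValueError(f"unknown transformation: {method}")

def fit_line(xs, ys):
    """Ordinary least-squares (intercept, slope) of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def quantal_fit(doses, n_subjects, n_responding, method="probit"):
    """Fit transformed response proportions linearly against dose."""
    props = [r / n for r, n in zip(n_responding, n_subjects)]
    return fit_line(doses, [transform(p, method) for p in props])

# Invented data: log-dose, subjects per batch, subjects responding.
log_doses = [0.0, 0.5, 1.0, 1.5]
subjects = [20, 20, 20, 20]
responding = [2, 7, 13, 18]
for method in ("probit", "logit", "loglog"):
    intercept, slope = quantal_fit(log_doses, subjects, responding, method)
    print(f"{method:7s} intercept={intercept:7.3f} slope={slope:6.3f}")
```

Comparing the residuals of the three lines is a rough numerical counterpart of the visual check the unit offers on the screen.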


Figures 1.2.3 and 1.2.4 are examples of numerical results.
Figure 1.2.5 indicates whether the selected transformation was
chosen properly. At this stage the user has a variety of options to
transform dosages (e.g., logs, square roots), and to preview figures
such as 1.2.5 before redoing the analysis with the new transformations.

Novel features: This is really standard quantal analysis, but
with a preview option to save unnecessary computer time. Also,
the inclusion of the Weibull transformation of degree 3 or
greater is non-standard. The ease with which best transformations
can be found, by combination of batches and scale changes, was
the reason for the popularity of this unit.

Until 1974, when "Interactive OMNITAB" became operational, the
QUANTAL unit was by far the most frequently used module. It was used
in studies of suicide trends, design of atomizer nozzles in insecticide
spray, studies on efficiency of family planning clinics, and many
others. A very simple interactive version exists (without graphics)
on our CDC Cyber computer. The algorithms are also used in the
WEIBUL module (see Section 2.1 below) and are as appropriate in the
fitting of growth functions as are order statistics and, of course,
much faster.


Demonstrations, with transparencies, were shown at several national
meetings in 1971-73. They were also shown to staff members at Fort
Benning, Ga., and used for data analysis of small arms simulation
studies. The unit was operational at the University of Arkansas in 1973.
This unit was fully operational at the last test, in October 1977.
However, the photographs are from an earlier run, since the quality
of the more recent photographs is inferior.


1.3 Cluster Configurations in n Dimensions

Documentation: Appendix I, Appendix Vol. III, pp. 1-71, and THEMIS
Report No. 17 (Trivedi [23]).

In our recent test under MVS we were unable to obtain photographs
of this unit. The reader is advised to look at the displays in Appendix
Vol. III and the instructions in THEMIS Report No. 17, pp. 68-113.

The development of this interactive graphical unit, which requires
batch pre-processing of data, was prompted by the desire to have a
combined factor-analysis and cluster-analysis algorithm which could
be used when the strongest factors or clusters are trivial. The
dilemma is best seen in an example which I ran at IBM many years
ago. The factor analysis of symptoms of diseases was overshadowed
by one factor, consisting of the trivial syndrome fever-pain-and-chills,
and the cluster analysis of patients identified just clusters of very
ill, slightly ill, and healthy patients, without differentiation
of diseases. A simple-structure rotation within clusters was attempted,
to see whether the patients fell into an overdetermined subspace
only in the clusters of ill people, and not in others. Many workers
in multivariate analysis would like to "see" their configurations
as projections of p-dimensional hyper-ellipsoids into ellipses in
2-dimensional subspaces.

The data which were generated were supposed to exhibit such
secondary structure - alignment into overdetermined subspaces ("simple
structure planes") in only some of the clusters. When we knew the
location of these subspaces, and chose the projection accordingly, they
were revealed at once (pages 86 and 88 of THEMIS Report No. 17, page 25 of
Appendix Volume III). In much the same way as factor analysts did
their rotations by graphical methods, for all pairs of factors,
before the advent of computers, we developed this module to display
such projections instantly on a screen - not just pairs of orthogonal
axes but any projection on which we might expect to find significant
overdetermination.

When one looks at one projection of a four-dimensional ellipsoid
in 2-space (see page 104 of THEMIS Report No. 17) one really has
no inkling of the underlying configuration when viewed from a different
projection (see page 105).
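
How much two projections of the same ellipsoid can differ is easy to show numerically. The sketch below is only an illustration, with invented names and an invented covariance matrix: it projects a four-dimensional covariance ellipsoid onto two chosen directions and reports the semi-axes of the resulting ellipse. It is not the display code of the unit.

```python
import math

def project_covariance(cov, u, v):
    """2x2 covariance of the data projected onto directions u and v."""
    def quad(a, b):
        return sum(a[i] * cov[i][j] * b[j]
                   for i in range(len(a)) for j in range(len(b)))
    return [[quad(u, u), quad(u, v)], [quad(v, u), quad(v, v)]]

def ellipse_axes(c2):
    """Semi-axis lengths (square roots of the eigenvalues) of a 2x2 covariance."""
    mean = (c2[0][0] + c2[1][1]) / 2.0
    half_diff = (c2[0][0] - c2[1][1]) / 2.0
    root = math.hypot(half_diff, c2[0][1])
    return math.sqrt(mean + root), math.sqrt(max(mean - root, 0.0))

# Hypothetical cluster: almost no spread along the fourth coordinate.
cov4 = [[9.0, 0.0, 0.0, 0.0],
        [0.0, 4.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 0.01]]
e = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

view_a = ellipse_axes(project_covariance(cov4, e[0], e[1]))
view_b = ellipse_axes(project_covariance(cov4, e[2], e[3]))
print(view_a, view_b)
```

In the first view the cluster looks like an ordinary ellipse; in the second it collapses almost to a line, which is the kind of picture that would hint at an overdetermined subspace.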

There  has  been  much  experimentation  with  this  unit,  and  it  was 
displayed  and  discussed  in  several  symposia  (here  at  the  University 
of  Georgia,  and  at  the  North  Carolina  State  University  at  Raleigh), 
but  it  appears  that  identification  of  overdetermined  subspaces  in 
n-dimensions  is  still  as  much  of  an  art  as  it  has  always  been. 

A very elaborate pre-processing program, described on pages 68-71
of THEMIS Report No. 17, gives the user several pointers on where to
look for overdetermined subspaces. The display is then very simple;
the user merely calls SLINK ELLIPSF and follows the simple instructions
(the reader is referred to pages 7-25 of Final Report, Appendix Vol.
III). The unit has been used in classroom instruction in a course
on multivariate analysis. The batch pre-processor has been used,
by itself, for several studies, especially in Food Science.

We  did  not  succeed  in  making  this  unit  operational  on  the  370/158 
under  MVS  - it  worked  well  under  earlier  releases  and  at  the  University 
of  Arkansas.  It  is  expected  that  changes  will  have  to  be  made  on  the 
device-dependent  segments  (e.g.,  last  buffer  address)  which  need  to 
be  set  differently  in  different  installations.  The  corresponding  code 
is  in  the  section  relating  to  the  GRAF  (UCLA)  monitor. 

1.4 Other Units Under UCLA GRAF Program

Queuing Analysis for Branched Processes

Documentation: Appendix D, Appendix Vol. I, pp. 132-178, and THEMIS
Report No. 12 (Knybel [16]).

This was an interactive program to obtain distributions of
queues (typically in a manufacturing process) under an M/M/s queuing
discipline  (exponential  arrival,  exponential  departure,  s servers 
in  a channel).  This  unit  was  used  in  classroom  instruction  prior 
to  1974,  when  it  was  replaced  by  the  more  versatile  PRODFLOW  unit 
(see  Section  2.5). 
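
For reference, the stationary queue-length distribution of an M/M/s system has a closed form, sketched below. The function name and the example rates are assumptions; the formulas are the standard textbook ones, not code taken from the unit.

```python
import math

def mms_distribution(arrival_rate, service_rate, servers, n_max):
    """Stationary probabilities P(N = n), n = 0..n_max, for an M/M/s queue."""
    a = arrival_rate / service_rate          # offered load
    rho = a / servers                        # utilization, must be below 1
    if rho >= 1.0:
        raise ValueError("queue is unstable: utilization must be below 1")
    head = sum(a ** n / math.factorial(n) for n in range(servers))
    tail = a ** servers / (math.factorial(servers) * (1.0 - rho))
    p0 = 1.0 / (head + tail)                 # probability of an empty system
    probs = []
    for n in range(n_max + 1):
        if n < servers:
            probs.append(p0 * a ** n / math.factorial(n))
        else:
            probs.append(p0 * a ** n /
                         (math.factorial(servers) * servers ** (n - servers)))
    return probs

# Hypothetical work station: 1 arrival and 1 service per unit time, 2 servers.
two_servers = mms_distribution(1.0, 1.0, 2, 10)
one_server = mms_distribution(1.0, 2.0, 1, 10)
print(two_servers[:3], one_server[:3])
```

With a single server the result reduces to the geometric distribution of the M/M/1 queue, which gives a quick check on the code.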

A Conversational  Unit  for  Hierarchical  Discriminant  Analysis 
Documentation:  Appendix  E,  Appendix  Vol.  I,  pp.  178-227,  and  THEMIS 
Report  No.  13  (Schwartz  [17]). 

This was a first attempt to obtain an interactive version of a
multivariate analysis program for irregular data, with plotting
facilities for a 2-way analysis of variance. It was replaced one year
later by the more general "Interactive Multivariate Data Analysis
Program" (see Section 2.4).




CHAPTER 2

Second Phase: Programs under the University of Georgia Monitor
System (GMS, COMAP, COMFORT)

For a narrative description of the system, its adaptation, and
transportation, see Chapter 3.

2.1 A Conversational Unit for Fitting Data to the Weibull Distribution

Documentation: Appendix N, Appendix Volume IV, pp. 63-101, and
THEMIS Report No. 23 (Chang-Wu Yen and J. E. Norman [30]).

This unit is an example of the value of a simple graphics
unit to the teaching of statistics and in research. It is one of
the easiest to operate (the user just calls SLINK WEIBUL and answers
questions as shown in Figures 2.1.1 and 2.1.2). Three methods of
estimation were employed for this rather wide class of growth curves:
a modification of the quantal methods used in bio-assay, which was
taken from the QUANTAL unit (Section 1.2), and two methods based on
order statistics. For large samples (n = 100) it makes no difference
which method is employed (see Figure 2.1.3, the bio-assay method,
and Figure 2.1.4, all three methods).
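
One simple way to fit a Weibull curve, given here only as an illustration, is to exploit the fact that ln(-ln(1 - F(x))) is linear in ln x, with slope equal to the shape parameter. The sketch below uses median-rank plotting positions and an unweighted least-squares line. It is an assumed stand-in and is not claimed to reproduce any of the three methods of the unit; the function name and the sample are invented.

```python
import math

def weibull_fit(sample):
    """Estimate (shape, scale) by least squares on the Weibull plot."""
    xs = sorted(sample)
    n = len(xs)
    # Median-rank plotting positions (Benard's approximation).
    ps = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    u = [math.log(x) for x in xs]                      # ln x
    w = [math.log(-math.log(1.0 - p)) for p in ps]     # ln(-ln(1 - F))
    mu, mw = sum(u) / n, sum(w) / n
    shape = (sum((a - mu) * (b - mw) for a, b in zip(u, w)) /
             sum((a - mu) ** 2 for a in u))
    scale = math.exp(mu - mw / shape)                  # from the intercept
    return shape, scale

# Invented small sample of ten observations.
sample = [0.05, 0.3, 0.7, 1.1, 1.9, 2.6, 4.0, 6.5, 9.8, 21.0]
shape_hat, scale_hat = weibull_fit(sample)
print(f"shape={shape_hat:.3f} scale={scale_hat:.3f}")
```

Because the line is fitted on a doubly logarithmic scale, the smallest and largest observations carry a great deal of weight, which is one reason different fitting methods can disagree on small samples.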

In the experimental run of the latest version (under MVS,
October 1977) we employed smaller sample sizes (n = 20, bio-assay
method, Fig. 2.1.5) and very small sample sizes (n = 10, Fig. 2.1.6,
all methods), including an example of the J-shaped members of the
Weibull class (Fig. 2.1.8, n = 10, bio-assay method). The striking
fact of the small-sample analyses is that the numerical estimates
are very different for different methods, and quite poor in relation
to the parent from which the sample arose (scale parameter estimate 4.33
from a population value of 2.0; shape parameter 0.41 from a parent
value of 0.5, Figure 2.1.7), but the quality of fit is quite impressive,
even for these small data sets (Fig. 2.1.8). Though different growth
models are fitted to the same data by different methods (Fig. 2.1.6),
it is not easy to decide which one yields superior results; the main
difference appears to lie in the relative importance given to the fit
of the growth model in the "tails," a well-known characteristic
in the analysis of growth or learning curves.

Novel  features  in  this  unit  are: 

The use of a very efficient modified bio-assay method for
fitting a complex class of growth models, and the comparison
with the more generally used order statistics methods. A
facility to obtain fits to data by different methods and
to make a visual comparison, rather than relying on numerical
values of scale and shape parameters; frequently, widely
differing parameter estimates yield equally good fits.

This unit has been used in classroom instruction (courses in
Distribution Theory and Non-linear Statistical Analysis) and for
illustration and verification of dissertation work on the topic
of estimation of parameters of the Weibull class. The unit has
been shown to be operational and transportable on the IBM
370/158 (under MVS).

2.2 Conversational Unit for Spline Function Construction

Documentation: Appendix O, Appendix Vol. IV, pp. 102-137, and
THEMIS Report No. 24 (J. S. Scott and J. E. Norman [31]).

Data filtering and smoothing is an important preparatory step
in the analysis of time series or the fitting of densities of mixed
statistical distributions. In the first step, some authors begin
the process* by fitting a cubic spline to the observed data, and then
apply various techniques of filtering (or moving averages) to obtain
a smoother (monotonic, unimodal, bimodal) fit.

A graphical unit seems especially useful to facilitate this process.
In the present unit, after calling SLINK SPLINE, the user receives
instructions to define coordinates and number of points, and then
to enter the points (Fig. 2.2.1 and 2.2.2). The program fits a cubic
spline through these points (Fig. 2.2.3), then enables the user to
change the ordinates of any point or points by the use of a light pen.
In Fig. 2.2.4 the light pen was placed on the asterisk near the center
of the field, and a series of dots was erected around this point.


*e.g., G. Wahba, "Smoothing," Symp. Appl. Stat., Dayton, 1976 (proceedings
in print).


Pointing the light pen at any of the dots, the user can produce an
O at this point, indicating the y-value through which the smoothed
spline would be fitted. Fig. 2.2.5 shows how the smoothed spline is
passed through the O's and the unchanged asterisks. Another illustration
is in Fig. 2.2.6 and 2.2.7.
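
The fit-and-relocate cycle described above can be imitated in a few lines. The sketch below assumes a natural cubic spline (zero second derivative at both ends); the report does not state which end conditions the unit uses, and the function name and data are invented. Moving a point with the light pen corresponds to changing one ordinate and fitting again.

```python
from bisect import bisect_right

def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through the points."""
    n = len(xs) - 1                       # number of intervals
    h = [xs[i + 1] - xs[i] for i in range(n)]
    m = [0.0] * (n + 1)                   # second derivatives; both ends stay 0
    if n >= 2:
        diag = [2.0 * (h[k] + h[k + 1]) for k in range(n - 1)]
        rhs = [6.0 * ((ys[k + 2] - ys[k + 1]) / h[k + 1]
                      - (ys[k + 1] - ys[k]) / h[k]) for k in range(n - 1)]
        for k in range(1, n - 1):         # forward elimination (tridiagonal)
            factor = h[k] / diag[k - 1]
            diag[k] -= factor * h[k]
            rhs[k] -= factor * rhs[k - 1]
        m[n - 1] = rhs[n - 2] / diag[n - 2]
        for k in range(n - 3, -1, -1):    # back substitution
            m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def spline(x):
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        a, b = xs[i + 1] - x, x - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)
    return spline

# Invented points; the third ordinate is then "moved" from 4.0 to 2.0.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 4.0, 3.0, 4.0]
before = natural_cubic_spline(xs, ys)
ys_moved = list(ys)
ys_moved[2] = 2.0
after = natural_cubic_spline(xs, ys_moved)
print(before(1.5), after(1.5))
```

Because a cubic spline is a global fit, relocating a single point changes the curve on the neighboring intervals as well, which is why an immediate redraw is helpful.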

After  the  decision  has  been  made  (after  as  many  changes  as 
desired),  the  user  obtains  the  equation  of  the  last  spline  (in  the 
usual  truncated  polynomial  notation;  the  small  illegible  mark 
before the **3 in each row of Fig. 2.2.8 is a +), and may use this for
purposes  of  interpolation  or  estimation  of  trends,  or  parameters 
of  mixed  distributions. 

Novel features: None really; merely a fast routine to draw
splines through points relocated by the use of a light pen.

Illustrations 2.2.1 to 2.2.8 are photographs of the screen,
obtained in October 1977, with the IBM 2250/2840 hooked up to a 370/158
under MVS. The unit also operated well after transportation of all data sets.

Use has been made of this unit in some courses, and for the
smoothing of data prior to performing a compartmental analysis
(estimation of coefficients in a system of differential equations) in a
dissertation study of tracers.

2.3 An Interactive Analysis-of-Covariance Unit

Documentation: Appendix P, Appendix Vol. IV, pp. 138-225, and THEMIS
Report No. 25 (M. E. Nash and R. E. Bargmann [33]).

Programs performing Analysis of Variance of a two-way classification
model, even with missing and unbalanced cells, have been in more than
adequate supply for two decades. On the other hand,
analysis-of-covariance units, except for the very simplest cases, are
in shorter supply. One reason is the problem of what to regard as
classificatory variables (the number of rows and columns in a design)
and what to regard as concomitant variables (design variables known
without error, but with ordinal scale characteristics). Most packages
provide a "general linear analysis" feature to deal with the more complex
situations of analysis of covariance, but fail to warn the user that,
where variables are categorized or "nominal", widely different
results can be obtained by a change of assumptions on the nature of
main effects and interactions in the unbalanced case.

An interactive unit to perform an analysis of covariance involving
unbalanced designs, with graphical display and variable selection
features, seemed to serve a useful purpose. This unit was implemented
in 1973 on the graphics terminal system (later adapted to teletype use,
see Section 3.3). When the user calls SLINK ANACOV, he receives
rather extensive instructions on the screen, first to set up his data
for an analysis of variance (without regression variables present).
The user determines the number of rows and columns, indicating how he
wishes to encode the levels of his factors, and then proceeds to enter
the data (Fig. 2.3.2). There are facilities for editing and augmenting
data. Up to 9 concomitant variables can be added; if there are fewer,
the last variable can be expanded into a polynomial of higher degree,
or the last two can be expanded to include mixed polynomials. Figure
2.3.2 is an example of just two concomitant variables (named RAIN and SUN)
with the program filling in the values for all quadratic and cubic
terms in these variables. Transformations on the random variables,
and on the concomitant variables, can then be indicated (Fig. 2.3.3).
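
The expansion of two concomitant variables into all quadratic and cubic terms can be sketched as follows. The function names are invented, and the ordering of the terms is an assumption; only the variable names RAIN and SUN come from the example above.

```python
from itertools import combinations_with_replacement
from math import prod

def polynomial_terms(names, max_degree):
    """Names of all monomials of degree 1..max_degree in the given variables."""
    terms = []
    for degree in range(1, max_degree + 1):
        for combo in combinations_with_replacement(names, degree):
            terms.append("*".join(combo))
    return terms

def expand_row(values, max_degree):
    """Values of the same monomials, in the same order, for one observation."""
    row = []
    for degree in range(1, max_degree + 1):
        for combo in combinations_with_replacement(values, degree):
            row.append(prod(combo))
    return row

terms = polynomial_terms(["RAIN", "SUN"], 3)
row = expand_row([2.0, 3.0], 3)          # hypothetical RAIN = 2, SUN = 3
for name, value in zip(terms, row):
    print(f"{name:14s} {value:6.1f}")
```

Two variables expanded through the third degree give nine terms in all (two linear, three quadratic, four cubic), which fits within the limit of nine concomitant variables mentioned above.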

The program returns with a standard analysis of variance: tables
of cell means, adjusted means for the levels of each factor (here
called LOCA and FFRT), and the usual analysis-of-variance table of
mean squares and F ratios (Fig. 2.3.4 and 2.3.5). At this stage the
user is encouraged to edit his data; he may even plot the data from
selected cells, rows, or columns (the random variable vs. each selected
concomitant variable) in order to detect outliers.

Up to this stage, data could have been entered in batch mode - in
fact, there is the option to do so. The purely conversational part
begins here with the names of the concomitant variables appearing, one
by one, and the user deciding, by pressing Program Function Key 1 or
2, whether he wishes to retain or exclude that variable in this pass.
Figure 2.3.6 indicates that here the variables RAIN, SUN, and the square
and cube of the first were to be retained. In a later pass, only
RAIN, its square, and its cube were selected.

The program returns the matrices of sums of squares and products
for error, the correlations based on them, and the regression coefficients
for the selected variables (Fig. 2.3.7). Plots of all data (Fig.
2.3.8) or selected rows or columns (Fig. 2.3.9) can be viewed, with
the regression polynomial displayed. Notice that the polynomial
applies to the data "within cells", and total data, or total rows,
would be affected by main effects in addition to regression on
concomitant variables. An analysis-of-covariance table (Fig. 2.3.10) can
be obtained at any time, and the user can decide if another pass
with another selection of concomitant variables and/or data editing
is desired.

Novel  features:  Analysis  of  covariance  with  unbalanced  designs, 
with  plotting  facilities,  variable  selection,  and  expansion  of 
variables  into  polynomials. 

This unit, and its later adaptation to teletype terminals (attached
to the CDC Cyber), has been in frequent use in the classroom. It was
demonstrated to visiting groups, reported on in meetings of the American
Statistical Association, and demonstrated at locations away from the
University of Georgia, over remote job entry or telephone lines
(Armstrong College and West Georgia College, 1975). It was shown
to be operational and transportable under MVS in October and November
1977; the attached figures are photographs of the screen from these trials.

2.4 An Interactive Multivariate Data Analysis Program

Documentation: Appendix J, Appendix Vol. III, pp. 72-209, and THEMIS
Report No. 18 (A. Ballengee and R. E. Bargmann [2^]).

The structure of this unit is so similar to the preceding (Analysis
of Covariance) unit that a separate set of photographs was not included
here; a very detailed sample of displays is to be found in Appendix J.

The unit was fully operational in our October/November 1977 trials.

The characteristics of experimental design are the same in
this unit and the preceding one: a one-way classification or two-way
classification design, with possibly missing cells and unbalanced
data. The user may encode the levels of his factors, and may specify
transformation of data. Runs can be repeated as often as desired
with changed transformations. There is a facility for data-editing,
and data may be entered through batch mode.

After data entry is complete, the program performs an analysis
of variance on each response variable separately. Since the variables
are not divided into random and concomitant variables
here, the user may stop the univariate analysis at any time. Usually,
however, the user would let the analysis of variance be completed
before proceeding to variable selection for the multivariate pass.

He  may  inspect  tables  of  cell  means  and  standard  deviations  within 
cells,  adjusted  means  for  rows  and  columns  (in  the  case  of  imbalance) 
and  the  usual  analysis  of  variance  tables. 

The multivariate pass begins with variable selection: the user
may decide to include any variable whose name appears on the screen
(by depressing Program Function Key 1) or ignore it (by depressing
Program Function Key 2). Such a decision is usually based on the F values
observed in the univariate analysis (response variables which have a
very low F value for subtotal would not be included in the multivariate
run). When the multivariate run is complete the user may inspect the
following results:

(a) Matrix of sums of squares and products based on error;

(b) Correlation matrix based on it;

(c) Likelihood-ratio test statistic for subtotal and attained
    probability level (recorded as 0, if it is less than 10^(-Q));

(d) Weights of the linear discriminant function (linear composite
    among response variables) which maximizes differences between
    cell effects; the largest characteristic root of H(H+E)^(-1),
    with parameters ready for entry into Roy-Heck charts;

(e) Correlation of each response variable vs. the best discriminator.

Steps (c), (d), and (e) are repeated for "Interaction", Row
Effects, and Column Effects, with a modification if the degrees
of freedom for a given hypothesis is one (e.g., only two levels
of rows or two levels of column effects). In this case, steps
(c) and (d) are omitted, and only the F statistic (based on
Hotelling's T^2) is reported with appropriate degrees of freedom.
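
The relation between the statistics in items (c) and (d) can be illustrated with a minimal Python sketch. It is not the code of the unit: it assumes only two response variables, so that the characteristic roots of H(H+E)^(-1) have a closed form, and the function names and the example matrices H (hypothesis) and E (error) are invented for illustration. The likelihood-ratio criterion is computed in its usual Wilks form, det(E)/det(H+E), which may differ from the form the unit reports.

```python
import math

def det2(m):
    """Determinant of a 2 by 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def add2(x, y):
    """Element-wise sum of two 2 by 2 matrices."""
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]

def inv2(m):
    """Inverse of a nonsingular 2 by 2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def mul2(x, y):
    """Product of two 2 by 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def wilks_lambda(h, e):
    """Likelihood-ratio criterion in Wilks form: det(E) / det(H + E)."""
    return det2(e) / det2(add2(h, e))

def largest_root(h, e):
    """Largest characteristic root of H (H + E)^(-1), 2 by 2 case only."""
    m = mul2(h, inv2(add2(h, e)))
    half_trace = (m[0][0] + m[1][1]) / 2.0
    disc = max(half_trace ** 2 - det2(m), 0.0)  # guard against rounding
    return half_trace + math.sqrt(disc)

# Hypothetical sums-of-squares-and-products matrices.
H = [[4.0, 0.0], [0.0, 1.0]]
E = [[4.0, 0.0], [0.0, 3.0]]
print(wilks_lambda(H, E), largest_root(H, E))
```

With more than two response variables a general eigenvalue routine would be needed in place of the closed form used here.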

There  is  a plotting  facility  (scatter  plots  for  each  selected  pair 
of  variables)  very  similar  to  the  analysis-of-covariance  unit  (all 
data,  a given  row,  column,  or  cell);  of  course  there  is  no  "regression 
function"  plot,  as  this  would  be  meaningless  in  pairs  of  random  variables. 

The most frequent use of this unit has been in multiple-stage
analysis:

(a) Separate variables into non-discriminators and discriminators on
    the basis of the F value in univariate analysis;

(b) separate the discriminators into classes: those that correlate
    highly (in absolute value) with the best discriminant function
    vs. those that do not;

(c) select the latter for further discriminant analysis, especially
    if a factor has many levels, and for the analysis of all treatment
    effects (Subtotals).

This process can be done, separately, for all effects, row effects,
and column effects.

Novel  features:  Hierarchical  discriminant  analysis;  reporting 
of  likelihood  ratio  statistics  (useful  for  testing)  and  union- 
intersection  statistics  (useful  for  confidence  regions  and 
variable  selection) 

Prior to its availability as an interactive module, this program
was available in batch mode (MUDAID) on IBM and CDC computers. Since the
number of response variables is often very large, the batch mode would
usually precede the conversational analysis. This unit was demonstrated
(with transparencies) at Ames, Iowa, and in several meetings of the
American Statistical Association. It has been used in the classroom
(Multivariate Analysis) and for research studies, especially in
educational research.

With  the  availability  of  very  rapid  turnaround  time  in  batch 
mode  in  recent  years,  and  because  of  the  voluminous  output  of  such 
multivariate  analyses,  the  interactive  version  has  had  less  use 
since  1975,  with  the  very  fast  batch  mode  being  widely  used  in  and 
outside  of  the  University  of  Georgia. 

2.5.  Product  Flow  Analysis 

Documentation: Appendix R, Appendix vol. V, pp. 2-120, and THEMIS
Report No. 27 (R. L. Wood [35]).

One of the popular management-information systems is the product-flow
system, where a model describes the flow of materials from one
station to the next; at each station, the product is processed by
"servers" (usually machines or operators). A good first approximation
to the queuing in such a production line is the M/M/s queue (exponential
arrival, exponential departure, s servers), possibly with additional
fixed delays at each station. Study of such product flow systems in
an interactive mode is especially attractive to determine which queues
will "explode" if servers in some stations are eliminated, and how
many servers will be needed in each station if the process is to be
scaled up to meet a specified demand (optimization of servers).

When the user calls $LINK PRODFLOW he receives instructions to
declare options (Fig. 2.5.1). Typically, Code 2 in Option A would
be chosen when a production line has been extensively studied in the
pilot-plant stage and is ready for scale-up to production stage to
meet a given demand. In Option B, certain service rate recommendations
can also be sought; it is, of course, up to the production engineer to
decide if such service rates are feasible. Option C determines display
modes (whether the user wishes to have detailed reports on the probability
of the length of each queue, or merely summary information on each station);
Option D determines the handling of exploding queues: whether a certain
percentage of "waste" is specified, or whether a maximum queue length is
to be stated.

Figure 2.5.2 is an example of the echo to a user-supplied set of
options; this run is typical of a first simulation of the line (no demand
data, no optimization, all information user-supplied). Notice that for
Station 4, a "tolerable" waste percentage must be specified, for when
Station 3 works at full capacity (4 servers, each at a rate of 2.5 units
per hour, with a yield of 90 percent) it supplies Station 4 at a rate of
9 units per hour, which is exactly the maximum rate which Station 4 can
handle (birth rate = death rate). Thus the queue can explode.
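
The arithmetic behind this remark, and the steady-state queue-length probabilities of a stable M/M/s station, can be sketched in Python as follows. This is an illustration based on the textbook M/M/s formulas, not the PRODFLOW code; the function names are assumptions, and since the number of servers at Station 4 is not stated in the text, the explosion check below uses a hypothetical single server with a service rate of 9 units per hour.

```python
import math

def station_output_rate(servers, rate_per_server, yield_fraction):
    """Output rate of a station working at full capacity."""
    return servers * rate_per_server * yield_fraction

def mms_queue_length_probs(arrival, service, servers, max_n):
    """Steady-state P(N = n), n = 0..max_n, for an M/M/s queue.

    N counts units in service plus units waiting. A steady state exists
    only if arrival < servers * service; otherwise the queue "explodes".
    """
    a = arrival / service            # offered load
    rho = a / servers                # utilization per server
    if rho >= 1.0:
        raise ValueError("no steady state: birth rate >= death rate")
    norm = sum(a ** k / math.factorial(k) for k in range(servers))
    norm += a ** servers / (math.factorial(servers) * (1.0 - rho))
    p0 = 1.0 / norm
    probs = []
    for n in range(max_n + 1):
        if n < servers:
            probs.append(p0 * a ** n / math.factorial(n))
        else:
            probs.append(p0 * a ** n
                         / (math.factorial(servers) * servers ** (n - servers)))
    return probs

# Station 3 of the text: 4 servers, 2.5 units per hour each, 90 percent yield.
supply_to_station_4 = station_output_rate(4, 2.5, 0.90)
print(supply_to_station_4)
```

When the supply rate equals the maximum service rate, as it does for Station 4, the function raises an error instead of returning probabilities, which corresponds to the need for a user-specified waste percentage or maximum queue length.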

Figure 2.5.3 shows the summary information of the initial station
which, by definition, works at full capacity. An example of a very
detailed report on queue-length distributions is presented for Station
5, in Figure 2.5.4. Where (toward the end) probabilities of a given
queue length (queue = number of units being served plus number of units
awaiting service) are less than .01, queue lengths (26-29, 30 and
greater) are lumped together. Figure 2.5.5 shows the same information
in condensed pictorial form, and is usually all that is required in
the day-to-day study of the simulated process line.

Figure 2.5.6 shows at which rate final product can be expected to
arrive. Figure 2.5.7 is the first frame of an optimization run, in
which the number of servers in each station is to be determined, so that
a demand rate of 75 units can be met. It shows that 41 servers would be
needed in Station 1. Where a pilot line works in a nearly optimal
capacity, this is, of course, easy; but where several critical points
occur in a system, the very simple queuing model can pinpoint the optimum
number of servers more appropriately.
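
A back-of-the-envelope version of such a server calculation is sketched below in Python. It is only a capacity argument and not the optimization performed by the unit: it assumes that each station is described by a per-server service rate and a yield fraction (both hypothetical inputs), works backward from the final demand, and ignores queuing variability and fixed delays. The station data in the example are invented and do not reproduce the run of Figure 2.5.7.

```python
import math

def servers_needed(demand, stations):
    """Smallest server count per station whose full-capacity output
    covers what the rest of the line requires.

    stations: list of (rate_per_server, yield_fraction), in line order.
    """
    counts = []
    required_output = demand
    for rate, yield_fraction in reversed(stations):
        capacity_per_server = rate * yield_fraction
        # Small tolerance so that an exact fit is not rounded up.
        counts.append(math.ceil(required_output / capacity_per_server - 1e-9))
        # Units this station must receive from the preceding station.
        required_output = required_output / yield_fraction
    counts.reverse()
    return counts

# Hypothetical two-station line feeding a demand of 9 units per hour.
print(servers_needed(9.0, [(5.0, 1.0), (2.5, 0.9)]))
```

A station sized in this way runs at or near full capacity, which is the situation in which the queuing model is needed to decide whether additional servers are required.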

Novel features: Numerical and pictorial display of queue buildup
in each station of a production-flow process. Application of
queuing theory to prediction of service capacity to meet a specified
demand rate of final product.

This unit was used in classroom instruction (Course on Information
Systems, STA 804) and, in thesis research, to study estimation procedures
in catenary compartmental systems. For the transient case a simulation
option is available (since evaluation of convolutions of modified
Bessel functions, needed for exact determination of probabilities, is
very time-consuming). Adaptation of this, and most other, phases of
this feature to a non-graphics system is, of course, very simple, but
has not yet been done because of lack of demand at the University of
Georgia.

2.6. Interactive OMNITAB

Documentation: Appendix V, Appendix vol. VI, pp. 2-144, and THEMIS
Report No. 31 (Bingham and Bargmann [4])

The  central  feature  of  OMNITAB  (National  Bureau  of  Standards)  is 
a worksheet  in  which  computations  take  place.  Most  of  the  commands  of 
OMNITAB  are  column-oriented  (operation  performed  on  a specified  column) 
or  matrix  manipulative,  with  matrices  described  by  locating  their 
beginning  coordinates  on  the  worksheet.  With  our  ability  to  display 
large  data  matrices  instantly,  on  a graphics  terminal,  it  was  appropriate 
to  attempt  implementation  of  an  interactive  version.  The  user  of  our 
interactive  version  needs  to  be  familiar  with  the  basic  OMNITAB  commands, 
and  the  positioning  and  meaning  of  operands.  However,  these  are  so  easy 
to  learn  that,  in  our  frequent  application  of  this  unit  in  the  classroom, 
it  took  students  only  a few  minutes  to  familiarize  themselves  with  the 
most  important  commands  (including  matrix  operations  and  statistical 
routines)  to  execute  quite  difficult  problems. 

In our interactive version we added, to the standard OMNITAB
commands, the statistical distribution functions

YORMX, YORMP, YORMZ, BETAX, BETAP, BETAZ, GAMX, GAMP, GAMZ, TTX,
TTP, TTZ, CHIX, CHIP, CHIZ, FFX, FFP, FFZ.

By using a modular facility which enables a programmer to add new
commands to our system (described in Chapter IV, pp. 127 to 139 of
THEMIS Report No. 31), we also added the matrix commands MTRI, MTRIN,
MCROOT, MCVECT. The column-oriented distribution commands evaluate,
for the given column, the normal distribution, beta distribution, gamma
distribution, t-distribution, χ²-distribution, or F distribution; if
the command ends in X, the probability is found, given an abscissa; if
it ends in P, the abscissa ("percentage point") corresponding to an
input probability is found; and if it ends in Z, the ordinate for a given
abscissa is found. As in all other column-oriented OMNITAB commands,
the arguments can be fixed numbers (entered in real notation with decimal
point) or column numbers (entered as integers).
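
The X/P/Z naming convention and the rule for arguments can be mimicked in a few lines of Python. The sketch below is an analogy only, not the OMNITAB implementation: it covers the normal family alone (the Python standard library has no beta, gamma, t, chi-square, or F routines), the worksheet is modeled as a dictionary of columns, and the helper names resolve and column_command are hypothetical.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()

# Analogues of the three normal-distribution commands.
YORMX = _STANDARD_NORMAL.cdf       # probability, given an abscissa
YORMP = _STANDARD_NORMAL.inv_cdf   # percentage point, given a probability
YORMZ = _STANDARD_NORMAL.pdf       # ordinate, given an abscissa

def resolve(worksheet, arg, nrows):
    """Integer argument: a column number. Real argument: a constant."""
    if isinstance(arg, int):
        return worksheet[arg][:nrows]
    return [float(arg)] * nrows

def column_command(func, worksheet, src, dest, nrows):
    """Apply func to every entry of the source and store into column dest."""
    values = resolve(worksheet, src, nrows)
    worksheet[dest] = [func(v) for v in values]

# Probabilities in column 1; percentage points are stored into column 2.
worksheet = {1: [0.5, 0.975]}
column_command(YORMP, worksheet, 1, 2, 2)
print(worksheet[2])
```

Writing the source argument as 0.975 (a real number) instead of 1 (an integer) would fill the destination column with the percentage point of that single constant.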

The matrix command MTRI obtains a lower triangular matrix T from
a symmetric, positive-definite matrix Q, such that TT' = Q; if Q is
singular (positive-semidefinite), the matrix T will be rectangular (n by r,
where n is the order and r the rank of Q). MTRIN obtains the inverse of T,
or an inverse from the left (of order r by n) if Q is singular. The
matrix commands MCROOT and MCVECT obtain characteristic roots and vectors
of symmetric matrices, a feature especially useful for multivariate
statistical analysis. A list of OMNITAB commands available on our
interactive system is displayed in Figure 2.6.1. This display is available
to the user on request, by depressing Program Function Key 2. The unit
was fully operational and transportable, on 370/158, under MVS. The
attached figures are photographs of the screen.
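
The factorization TT' = Q performed by MTRI, and the inversion performed by MTRIN, correspond to the familiar Cholesky decomposition. The following Python sketch shows the idea for the positive-definite case only; the singular (rectangular T) case described above is not handled, and the function names are illustrative, not those of the system.

```python
import math

def lower_triangular_factor(q):
    """Return lower triangular T with T T' = Q (Q symmetric, positive-definite)."""
    n = len(q)
    t = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = q[j][j] - sum(t[j][k] ** 2 for k in range(j))
        if pivot <= 0.0:
            raise ValueError("matrix is not positive-definite")
        t[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            partial = sum(t[i][k] * t[j][k] for k in range(j))
            t[i][j] = (q[i][j] - partial) / t[j][j]
    return t

def lower_triangular_inverse(t):
    """Invert a nonsingular lower triangular matrix by forward substitution."""
    n = len(t)
    inv = [[0.0] * n for _ in range(n)]
    for j in range(n):
        inv[j][j] = 1.0 / t[j][j]
        for i in range(j + 1, n):
            partial = sum(t[i][k] * inv[k][j] for k in range(j, i))
            inv[i][j] = -partial / t[i][i]
    return inv

Q = [[4.0, 2.0], [2.0, 5.0]]
T = lower_triangular_factor(Q)
print(T, lower_triangular_inverse(T))
```

For this Q the factor is T = [[2, 0], [1, 2]], and multiplying T by its transpose returns Q.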

Figure  2.6.2  has  the  record  of  commands  used  in  a session  performed 
during  our  October/November  1977  trials;  such  a record  is  available  to 
the  user,  at  any  time,  by  pressing  Program  Function  Key  2;  he  can  also 
see  a list  of  all  OMNITAB  commands  currently  on  the  system  by  depressing 
Program  Function  Key  3. 

The  central  feature  of  our  interactive  version  is  the  ability  to 
display  portions  of  the  worksheet  instantly,  after  each  operation. 

Twelve  lighted  Program  Function  Keys  are  arranged  in  geometric  analogy 
to  the  worksheet,  each  causing  display  of  a 40  by  5 array  (the  total 
worksheet  in  our  interactive  mode  is  80  by  30).  (See  Fig.  2.6.2  to  2.6.5) 

The session is initiated by a call to $LINK OMTAB. A discussion
of the enclosed illustration may help the reader to understand the
operation of our version. ERASE is mandatory, since all our programs are
overlaid into a small segment of core. SET 1 enters twelve probabilities
into Worksheet Column 1. Three values were replaced with the READ 1
instruction (the echo following the READ command is ROW 1; if the user
wishes to start elsewhere he types ROWX, and the program counts entries
from this point on; see the detailed description on pp. 1-16 of THEMIS
Report No. 31). On depressing Program Function Key 10 (the key
corresponding to the top-left portion of the worksheet) the user would see
only the first column of Worksheet Part 1, shown in Figure 2.6.3.

The command YORMP 1 2 in Fig. 2.6.2 obtains the percentage points
under the normal distribution for the values in Col. 1 and stores them
into Col. 2. The next command, CHIP 1 7. 3, uses, again, the probability
values of column 1 as input and the constant 7. as the degrees of freedom
(a real number, hence not a column number), and places percentage points
under the χ²-distribution, with 7 degrees of freedom, into column 3.
TTP 1 27. 4 obtains, in col. 4, the percentage points corresponding to
col. 1, under the Student t distribution, with 27 degrees of freedom.
FFP 1 7. 27. 5 (the first FFP call was erroneous, and the program gave
the option to press Program Function Key 2 to ignore it; see THEMIS
report No. 31) produces, in col. 5, the percentage points under the
F-distribution, corresponding to the values in col. 1, with 7 and 27 df.
Figure 2.6.3 is the output of all these operations; typically, this
page of the worksheet was called after each command.

The remainder of this demonstration is an example of the kind of
operations used frequently in classes of statistical methods and
multivariate analysis, with the facility to see, instantly, the results of
each matrix operation. A matrix of order 21 by 5 has been stored into
columns 6 to 10. The OMNITAB command M(X'X) 1 6 21 5 23 6 takes this
matrix ("the matrix starting at coordinate point 1, 6 of order 21 by 5"),
performs the X'X multiplication, and stores the result into a field
beginning in row 23, col. 6. MCVECT (an operation we added to our
system) obtains the characteristic vectors of the 4 by 4 submatrix starting
at (23, 6), and places them into the field beginning at (29, 1) (later
moved to (29, 6) to have it, for convenience, in the same display as
X'X). The four characteristic roots are on the far right (4 values
beginning in 29, 10). The 5 by 5 matrix starting at 34, 6 contains
the inverse of X'X. Its last element is moved to (1, 11) in order to
have it in the next display. The setting of NRMAX to 1 ensures that,
until a new value is specified, all column commands will operate on
the first number in a column only, but matrix commands are not
affected.
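
The way a matrix command addresses the worksheet ("the matrix starting at coordinate point 1, 6 of order 21 by 5") can be illustrated by a small Python sketch. It is an analogy, not the OMNITAB code: the worksheet is taken to be 80 rows by 30 columns (an interpretation of the dimensions quoted above), coordinates are 1-based as in the commands, and the helper names are hypothetical. A 3 by 2 matrix is used in place of the 21 by 5 matrix of the session.

```python
ROWS, COLS = 80, 30  # assumed orientation of the interactive worksheet

def new_worksheet():
    return [[0.0] * COLS for _ in range(ROWS)]

def get_block(ws, row, col, nrows, ncols):
    """Read the nrows by ncols matrix whose top-left cell is (row, col)."""
    return [ws[row - 1 + i][col - 1:col - 1 + ncols] for i in range(nrows)]

def put_block(ws, row, col, block):
    """Store a matrix into the field whose top-left cell is (row, col)."""
    for i, values in enumerate(block):
        ws[row - 1 + i][col - 1:col - 1 + len(values)] = values

def m_xtx(ws, row, col, nrows, ncols, dest_row, dest_col):
    """Analogue of M(X'X): form X'X for the addressed matrix and store it."""
    x = get_block(ws, row, col, nrows, ncols)
    xtx = [[sum(x[k][i] * x[k][j] for k in range(nrows))
            for j in range(ncols)] for i in range(ncols)]
    put_block(ws, dest_row, dest_col, xtx)

ws = new_worksheet()
put_block(ws, 1, 6, [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
m_xtx(ws, 1, 6, 3, 2, 23, 6)     # compare: M(X'X) 1 6 21 5 23 6
print(get_block(ws, 23, 6, 2, 2))
```

Because every result is written back into the worksheet, displaying the relevant page after each command shows the intermediate matrices at once, which is the feature emphasized in this section.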

The command MSCALAR appearing after the MMOVE shows another
feature which makes OMNITAB so attractive: The scalar multiplier is
reported as *40, 6*, which means that this constant is to be taken
from the worksheet, the number appearing at the (40,6) coordinate.

Figures 2.6.4 and 2.6.5 show the results of all these operations; the
pages of the worksheet were displayed after each operation.

Novel  features:  A facility  to  display  segments  of  the  OMNITAB 
worksheet  immediately  after  each  operation.  A facility  to  add 
commands  to  the  system,  conforming  with  the  general  OMNITAB 
characteristics.  Addition  of  statistical  distribution  functions 
to  the  OMNITAB  commands,  and  thus  the  facility  to  construct  tables 
instantly,  especially  for  unusual  values  (e.g.,  fractional  degrees 
of  freedom  or  extreme  probability  values). 

Since its availability in 1972, this has been the most popular
unit. Some of the designers of OMNITAB saw this unit in operation.
It was reported at the national meetings of the American Statistical
Association in New York (1973) and Atlanta (1975). Students in the
Multivariate Analysis classes quickly learned how to operate this unit
and could, step-by-step, perform canonical-partial correlation problems
and factor analysis problems in a very short time (15 to 30 minute
sessions for each student).

It is only since the advent of the interactive SAS matrix
manipulations at our campus (late 1976) that there has been anything
comparable to this facility. Adaptation to more modern equipment has not
yet been undertaken, since large instant display is a very important
feature, and teletype terminals may not be fast enough to eliminate the
element of user fatigue (see discussion of the Analysis of Covariance
adaptation in Section 3.3).

2.7. Response Surface Fitting

Documentation: THEMIS Report No. 37 (W. P. Bond and R. E. Bargmann [43]),
pp. 40-78, 105-176, 255-314.

This interactive unit is designed to permit study of response
surfaces, for up to four factors (variables) each with either 2, 3, or 4
ordinal levels*. The number of replications for each level combination
(cell) must be equal. The surface fitted is thus a tensor-product of
polynomials, up to the third degree in each of the 2, 3, or 4 variables.

The user calls $LINK PATS and is asked to define and name variables,
encode levels (if unequally spaced) and enter data. Figure 2.7.1 shows
the photograph of the screen when data entry is complete. The user may
review the analysis of variance table (part of which is shown in Fig.
2.7.2) and the sum of squares (= mean square) associated with each
orthogonal contrast (Fig. 2.7.3; the coefficients are those of orthogonal
polynomials of the response equation, e.g., the last (2,2) coefficient is
that of (3a^2 - 2)(3b^2 - 2), where a = 1 for high level, 0 for middle level,
and -1 for low level of Factor A).
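
The orthogonal-polynomial coding quoted above can be checked with a short Python sketch. It assumes two factors at three equally spaced levels each, coded -1, 0, 1 as in the text; the function names are illustrative and the sketch is not the extended Yates algorithm used by the unit.

```python
LEVELS = (-1, 0, 1)  # low, middle, high

def linear(a):
    """Linear orthogonal polynomial for three equally spaced levels."""
    return a

def quadratic(a):
    """Quadratic orthogonal polynomial, 3a^2 - 2, giving 1, -2, 1."""
    return 3 * a * a - 2

def product_contrast(f, g):
    """Contrast coefficients f(a) * g(b) over the 3 by 3 table of cells."""
    return [[f(a) * g(b) for b in LEVELS] for a in LEVELS]

def dot(x, y):
    """Sum of products of two coefficient tables over all cells."""
    return sum(x[i][j] * y[i][j] for i in range(3) for j in range(3))

quad_quad = product_contrast(quadratic, quadratic)  # the (2,2) term
lin_lin = product_contrast(linear, linear)          # the (1,1) term
print(quad_quad)
print(dot(quad_quad, lin_lin))  # orthogonal contrasts give zero
```

Each such table of coefficients carries one degree of freedom, which is why the associated sum of squares equals its mean square.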

The  user  may  now  choose  to  eliminate  some  of  the  degrees-of-freedom, 
e.g.,  when  F-ratios  are  too  small  for  that  contrast,  or  when  the  degree 
of  the  corresponding  product  of  polynomials  is  too  large.  He  may  also 
wish  to  edit  data. 

After such modification, if any, the user has various options of
viewing the response equation (Fig. 2.7.4). He may plot the response
equation vs. the levels of one factor, with the other factor or factors
specified at a given point, which need not be one of the exact levels
(see Fig. 2.7.5, where FACB is set at 2, and 2.7.6 where FACB is set
at 1); 95% confidence bounds are displayed around the response equation.
This mode of display is especially useful for process evaluation (see
pp. 143-165 of THEMIS report No. 37).

Another display mode (Fig. 2.7.7) enables the user to view contours
of the response surface, for two factors, where the third and fourth
factors (if any) are fixed at a specified level. Fig. 2.7.8 shows the
contour where the response is 25 (plotted as 4) and 26 (plotted as 3);
the desired level of the response variable was entered by the user. The
attainment of a maximum is shown in Fig. 2.7.9 (at 26.4 for the response).
Figure 2.7.10 shows the algebraic expression of the response equation.

This mode of display is useful for locating extrema, ridges, or saddle
points of response surfaces, and for studying the behavior of likelihood
functions involving two or more parameters (pp. 166-176 of THEMIS report
No. 37).

* If levels are nominal, the analyses of variance are still valid;
response equations are, of course, useless in that event.

Novel features: Efficient algorithms (extended Yates) for
obtaining response equations. Plotting of projections of
response equations and confidence regions. Calculations
and display of contours for all levels of two factors, with
the others held fixed. Facility to eliminate individual
orthogonal contrasts.

This unit was operational on the 370/158, under MVS, in November,
1977. The figures shown here are photographs of the screen from these
trials (the displays shown in THEMIS report No. 37 were CALCOMP plots
of the screen display). Since the unit was not fully operational until
two years after the end of the contract period, its use has been limited
to classroom instruction in a course on Information Systems, and in a
course on Non-linear Statistical Analysis.

2.8. Function Plotting Facility

Documentation: THEMIS Report No. 37 (Bond and Bargmann [43]), pp. 29-104,
177-231, 315-336.

This interactive unit is designed to prepare function plots,
especially of statistical distribution functions. It can plot simple
algebraic expressions involving functions in the graphics calculator unit
(see Section 3.2). Since it is designed especially for functions the
evaluation of which takes considerable computer time, the exact evaluation
takes place on only 11 points chosen over the domain of the abscissa
specified by the user; the rest of the graph is obtained by fitting a cubic
spline through these 11 points. In the region of final interest the
precision of the function plot can be improved by the "blowup" technique,
which fits the cubic spline for a range of X-values which can be narrowed at
each stage.

After calling $LINK BLOWUP the user receives instructions such as
those exhibited in Figure 2.8.1. In this example, a variable parameter
was introduced, called DF, its value set at 15, 30, 60, and 120. The
function is to be plotted for values of X, in increments of .02, from
-4 to +4 (note that, no matter what the desired increments are, only
11 points are determined precisely from the function subprograms). The
function stated at the end (illegible in the photograph) was

Y = YORMZ(X) - TTZ2(X,DF)

denoting the difference between the probability density functions of the
normal distribution and t distributions with selected degrees of freedom.
Figure 2.8.2 is an echo of the user-supplied input, and enables the user,
with the help of program function keys, to change any line requested.

Figure 2.8.3 is the plot for this input. Notice that one of the parameters,
the minimum and maximum value of the Y-axis, can be changed at this stage.
This facility was introduced because, frequently, the range of function
values must be guessed at the first attempt and usually produces a poor
plot (one poor plot preceded the one shown in Fig. 2.8.3).
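
The technique described for this unit (exact evaluation at eleven points, with a cubic spline supplying the rest of the curve) can be sketched in Python as follows. The sketch makes several assumptions that are not taken from the report: the eleven points are equally spaced, the spline is a natural cubic spline, and the function names are invented. The density difference is written out directly as a stand-in for the calculator expression Y = YORMZ(X) - TTZ2(X,DF).

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
    return (math.exp(log_const) / math.sqrt(df * math.pi)
            * (1.0 + x * x / df) ** (-(df + 1) / 2.0))

def density_difference(x, df):
    return NormalDist().pdf(x) - t_pdf(x, df)

def natural_cubic_spline(xs, ys):
    """Second derivatives at the knots of the natural cubic spline."""
    n = len(xs)
    y2 = [0.0] * n
    u = [0.0] * n
    for i in range(1, n - 1):
        sig = (xs[i] - xs[i - 1]) / (xs[i + 1] - xs[i - 1])
        p = sig * y2[i - 1] + 2.0
        y2[i] = (sig - 1.0) / p
        slope_change = ((ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
                        - (ys[i] - ys[i - 1]) / (xs[i] - xs[i - 1]))
        u[i] = (6.0 * slope_change / (xs[i + 1] - xs[i - 1])
                - sig * u[i - 1]) / p
    for k in range(n - 2, -1, -1):   # back substitution
        y2[k] = y2[k] * y2[k + 1] + u[k]
    return y2

def spline_value(xs, ys, y2, x):
    """Evaluate the spline at x, locating the interval by bisection."""
    lo, hi = 0, len(xs) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if xs[mid] > x:
            hi = mid
        else:
            lo = mid
    h = xs[hi] - xs[lo]
    a = (xs[hi] - x) / h
    b = (x - xs[lo]) / h
    return (a * ys[lo] + b * ys[hi]
            + ((a ** 3 - a) * y2[lo] + (b ** 3 - b) * y2[hi]) * h * h / 6.0)

def spline_plot_points(func, lo, hi, step, knots=11):
    """Evaluate func exactly at the knots only; the spline fills the grid."""
    xs = [lo + i * (hi - lo) / (knots - 1) for i in range(knots)]
    ys = [func(x) for x in xs]
    y2 = natural_cubic_spline(xs, ys)
    count = int(round((hi - lo) / step))
    return [(lo + j * step, spline_value(xs, ys, y2, lo + j * step))
            for j in range(count + 1)]

curve = spline_plot_points(lambda x: density_difference(x, 15), -4.0, 4.0, 0.02)
print(len(curve))
```

A "blowup" in this setting amounts to calling spline_plot_points again with a narrower interval, so that the eleven exact evaluations are spent where the detail is wanted.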

The displays shown in Fig. 2.8.4 and 2.8.5 illustrate another
important feature of this unit. The function POWER was not a member in the
graphics calculator, and was included by a simple technique described on
pp. 89-91, and illustrated on pp. 224-231 of THEMIS report No. 37. The
built-in function was BETNCS (see Section 4.2) which evaluates the
non-central beta distribution, but arguments for the latter need to be
obtained by a call to BETAP first. The plots in Fig. 2.8.5 are power
functions for the analysis-of-variance test, with 4 and 36 degrees of
freedom, for α = .05, .01, and .001, and for the noncentrality parameter
(y /n) extending from 0 to 2.

Figures 2.8.6 and 2.8.7 illustrate the use of the "blowup" facility.
It was desired, for the α = .01 case, to find that value of the
noncentrality parameter for which the power would be .90; Fig. 2.8.6 shows
that it is between .6 and .65; Fig. 2.8.7 provides further resolution
and shows the desired value to be very close to .610.

Novel features: Plotting of statistical distributions, functions
of functions, with multiple-plot facilities for values of a given
parameter. Use of a combination of very few precise evaluations,
and a cubic spline fitted through eleven points, to produce
adequate graphical resolution. Use of blow-up facilities to narrow
down the region of interest.

Since this unit was not operational until 2 years after the end of
the contract period, its use was restricted to applications in the
classroom, in courses on Advanced Scientific Computation and Distribution
Theory. This unit was operational and transportable in November, 1977,
under MVS on IBM 370/158. The attached figures are photographs from the
trial sessions.

THE GMS SYSTEM, ADAPTATION, AND TRANSPORTATION


3.1 Loading from Tape U0120

The partitioned and sequential data sets required
to operate the units described in chapters 1 and 2 have
been stored on a tape, named U0120, at the Computer
Center of the University of Georgia. A copy of this tape,
locally labeled U3115, has been sent to the Program
Director of Probability and Statistics, at the Office of Naval
Research.

Description of Files:

File 1: SYS1.GRAPHLIB

   Member names: ACTIVE, ANACOV, BASIC, BLOWUP, CALCG, CALCO, COMAP,
   CURVEFIT, DINTRP, DINTRQ, DUMP, ELLIPSE, FILETST, FORECAST, GENPLT,
   GWRITERB, GWRITERL, HOPES, LARGE, MAXL, MINGY, MODEL, MULPL, MXCTROL,
   OMTAB, PATS, PLOTF, PRODFLOW, QUANTAL, RATIO, REGRES, RESCALCB,
   RES2250, RES2250B, SDES, SPLINE, SPLINEP, SPOOK, SURF, UGUESSIT,
   WEIBUL, XCTROL

   Purpose: These are the load modules of the interactive units; this
   is the data set which gets modified when units are changed or added.
   Status of November, 1977.

File 2: SYS2.GRAPHLIB

   Member names: QUANBUG, RESCALCB, SHOWFIL

   Purpose: Load modules used for experimentation.

File 3: TP.LOAD

   Member names: DEPTS, GETALL, GTR3435, LABELS, LDSAVE, LOGCNUTL,
   MXCTROL, PREPROM, PTPCH, SAMPS2, SESSIONS, TPCHARTS, TSOSMF,
   TSOSMF2, UADSLIST

   Purpose: Utility and monitoring routines.

File 4: SYS1.SSPLIB

   Members: IBM Scientific Subroutine Package

File 5: SYS1.EXT2E0A

   Random access file; contains displays

File 6: SYS1.W2250

   FT06F001 and SYSPRINT; for diagnostics (GWRITERB)

File 7: PLOTA

   When desired, in a given session, produces image of individual
   displays, to be processed by CALCOMP plotter.

File 8: PLOTB

   Same as File 7, but for permanent storage for two or more sessions.

File 9: SYS1.GMSLIB

   Member names: $INITG$, BUILD, EOBINT, FETCH, GBUSY, GFIELD, GPOST,
   GRINIT, GWAIT, INDEX, INITP, INK, INX, LINGEN, LPINT, PFINT, SCOPLT,
   STREAL, WAITG, XBLANK, XBLANKS, YENTEST

   Utility routines required to add or modify sets on
   SYS1.GRAPHLIB (see THEMIS Report No. 14 (Penn [18]))

Files 10, 11, and 12: Various examples and diagnostics,
not needed for use of the system.

Organisation and formats of these data sets are shown
in Tables 3.1.1 and 3.1.2.

TABLE 3.1.1

DATA SETS ON THE TAPE OF U0120

FILE                                      UNIT AT    VOLUME AT
NO.    NAME OF DATA SET          DSORG    UGA.C.C.   UGA.C.C.

 1     UGA.SYS1.GRAPHLIB         PO       3330       UGD04a
 2     UGA.SYS2.GRAPHLIB         PO       3330       UGS004
 3     TP.LOAD                   PO       2314       UGCPS1
 4     SYS1.SSPLIB               PO       3330       UGALB1
 5     UGA.SYS1.EXT2EOA          DA       3330       UGS004
 6     UGA.SYS1.W2250            PS       2314       UGD04B
 7     UGA.PLOTA                 PS       3330       UGS004
 8     UGA.PLOTB                 PS       3330       UGS004
 9     UGA.SYS1.GMSLIB           PO       3330       UGD04B
10     JCL TO RUN GMS FOR        PS
       THE DSN CHANGED
11     JCL TO RESTORE DATA       PS
       SET FROM TAPE TO DISK
12     DOCUMENTATION OF          PS
       HOW TO USE THE IBM
       2250 AND GMS.
13     ABSTRACT DIRECTION        PS
       OF THE PROGRAMS IN
       GMS

TABLE 3.1.2

DCB FOR EACH DATA SET
OF GMS

                       SPACE USED
NAME OF DATA SET       (TRKS)       RECFM   LRECL   BLKSIZE

UGA.SYS1.GRAPHLIB      350          U       **      7294
UGA.SYS2.GRAPHLIB       22          U       **      2000
TP.LOAD                 92          U       **      3625
SYS1.SSPLIB             72          U       **      2000
UGA.SYS1.EXT2EOA         3          F       400      400
UGA.SYS1.W2250           1          FBA     121      121
UGA.PLOTA                1          VBS     364     2000
UGA.PLOTB                2          VBS     364     2000
UGA.SYS1.GMSLIB         21          U       **      7192

Table 3.1.3

JCL for the Loading of Data Sets from U0120:

//TTTT JOB (account,password),'LUP',MSGLEVEL=1,CLASS=B,
//  REGION=160K,PRTY=9,TIME=10
//*MAIN SYSTEM=SY2,LINES=100
//SE EXEC PGM=IEBCOPY
//SYSPRINT DD SYSOUT=A
//A DD DSN=TAPE.GRAPH1,UNIT=2400,VOL=SER=U0120,DCB=DEN=2,
//  DISP=OLD,LABEL=(01,NL)
//B DD DSN=UGA.SYS1.GRAPHLIB,UNIT=3330,DISP=(,CATLG),
//  SPACE=(CYL,(50,1,20))
//SYSUT3 DD UNIT=3330,SPACE=(CYL,(1,1))
//SYSIN DD *
XXXX COPY OUTDD=B,INDD=A
//SA EXEC PGM=IEBCOPY
//SYSPRINT DD SYSOUT=A
//A DD DSN=TAPE.GRAPH2,UNIT=2400,VOL=SER=U0120,DCB=DEN=2,
//  DISP=(OLD,PASS),LABEL=(02,NL)
//C DD DSN=UGA.SYS2.GRAPHLIB,UNIT=3330,DISP=(,CATLG),
//  SPACE=(CYL,(1,1,1))
//SYSUT3 DD UNIT=3330,SPACE=(CYL,(1,1))
//SYSIN DD *
XXXX COPY OUTDD=C,INDD=A
//SB EXEC PGM=IEBCOPY
//SYSPRINT DD SYSOUT=A
//A DD DSN=TAPE.TPLO,UNIT=2400,VOL=SER=U0120,DCB=DEN=2,
//  DISP=(OLD,PASS),LABEL=(03,NL)
//D DD DSN=UGA.TP.LOAD,UNIT=3330,DISP=(,CATLG),
//  SPACE=(CYL,(1,1,4))
//SYSUT3 DD UNIT=3330,SPACE=(CYL,(1,1))
//SYSIN DD *
XXXX COPY OUTDD=D,INDD=A
//SD EXEC PGM=IEBCOPY
//SYSPRINT DD SYSOUT=A
//A DD DSN=TAPE.SSPL,UNIT=2400,VOL=SER=U0120,DCB=DEN=2,
//  DISP=(OLD,PASS),LABEL=(04,NL)
//F DD DSN=UGA.SYS1.SSPLIB,UNIT=3330,DISP=(,CATLG),
//  SPACE=(CYL,(1,1,90))
//SYSUT3 DD UNIT=3330,SPACE=(CYL,(1,1))
//SYSIN DD *
XXXX COPY OUTDD=F,INDD=A
//SJ EXEC PGM=IEBGENER,REGION=450K
//SYSPRINT DD SYSOUT=A
//SYSIN DD DUMMY
//SYSUT1 DD DSN=TAPE.EXT2,UNIT=2400,VOL=SER=U0120,DISP=(OLD,PASS),
//  LABEL=(05,NL),DCB=(RECFM=F,LRECL=400,BLKSIZE=400,DEN=2)
//SYSUT2 DD DSN=UGA.SYS1.EXT2E0A,UNIT=3330,DISP=(,CATLG),
//  SPACE=(CYL,(1,1))
//SI EXEC PGM=IEBGENER,REGION=450K
//SYSPRINT DD SYSOUT=A
//SYSIN DD DUMMY
//SYSUT1 DD DSN=TAPE.W2250,UNIT=2400,VOL=SER=U0120,DISP=(OLD,PASS),
//  LABEL=(06,NL),DCB=(RECFM=F,LRECL=121,BLKSIZE=121,DEN=2)
//SYSUT2 DD DSN=UGA.SYS1.W2250,UNIT=3330,DISP=(,CATLG),
//  SPACE=(CYL,(1,1))
//SG EXEC PGM=IEBGENER,REGION=450K
//SYSPRINT DD SYSOUT=A
//SYSIN DD DUMMY
//SYSUT1 DD DSN=TAPE.PLA,UNIT=2400,VOL=SER=U0120,DISP=(OLD,PASS),
//  LABEL=(07,NL),DCB=(RECFM=U,BLKSIZE=2000,LRECL=2000,DEN=2)
//SYSUT2 DD DSN=UGA.PLOTA,UNIT=3330,DISP=(,CATLG),SPACE=(CYL,(1,1))
//SH EXEC PGM=IEBGENER,REGION=450K
//SYSPRINT DD SYSOUT=A
//SYSIN DD DUMMY
//SYSUT1 DD DSN=TAPE.PLB,UNIT=2400,VOL=SER=U0120,DISP=(OLD,PASS),
//  LABEL=(08,NL),DCB=(RECFM=U,BLKSIZE=2000,LRECL=2000,DEN=2)
//SYSUT2 DD DSN=UGA.PLOTB,UNIT=3330,DISP=(,CATLG),
//  SPACE=(CYL,(1,1))
//SQ EXEC PGM=IEBCOPY
//SYSPRINT DD SYSOUT=A
//A DD DSN=TAPE.GMSLIB,UNIT=2400,VOL=SER=U0120,DCB=DEN=2,
//  DISP=(,PASS),LABEL=(09,NL)
//D DD DSN=UGA.SYS1.GMSLIB,UNIT=3330,DISP=(,CATLG),
//  SPACE=(CYL,(1,1,4))
//SYSUT3 DD UNIT=3330,SPACE=(CYL,(1,1))
//SYSIN DD *
XXXX COPY OUTDD=D,INDD=A
/*


Table 3.1.4  JCL for Execution

JCL TO RUN GMS ON UNIV. OF GEORGIA (MVS) NOV. 1977

// A JOB CARD
//*MAIN SYSTEM=SY2,LINES=20
// EXEC PGM=MXCTROL
//GO.STEPLIB DD DSN=SYS1.PLILIB,DISP=SHR
//  DD DISP=SHR,DSN=TP.LOAD
//  DD DSN=UGA.SYS1.GRAPHLIB,DISP=SHR
//  DD DISP=SHR,DSN=UGA.SYS2.GRAPHLIB
//GO.FT11F001 DD DISP=(,DELETE),
//  DCB=(,BLKSIZE=0164,LRECL=0160,RECFM=VBS),
//  UNIT=NOSHR,SPACE=(CYL,(1,1))
//GO.FT12F001 DD DISP=(,DELETE),
//  DCB=(,BLKSIZE=0804,LRECL=0800,RECFM=VBS),
//  UNIT=NOSHR,SPACE=(CYL,(1,1))
//GO.FT14F001 DD DUMMY
//GO.FT60F001 DD DUMMY
//GO.FT06F001 DD DSN=UGA.SYS1.W2250,DISP=SHR
//GO.SYSPRINT DD DSN=UGA.SYS1.W2250,DISP=SHR
//GO.WORKDD DD UNIT=3330,SPACE=(CYL,(2),,CONTIG),
//  DCB=(DSORG=DA,RECFM=F,LRECL=3330,BLKSIZE=3330)
//GO.FT05F001 DD DSN=UGA.SYS1.R2250,DISP=SHR
//GO.FT04F001 DD DUMMY,
//  DCB=(BUFNO=1,RECFM=VB,LRECL=84,BLKSIZE=848)
//GO.FT20F001 DD UNIT=NOSHR,SPACE=(CYL,1),
//  DCB=(DSORG=PS,LRECL=32,RECFM=FA)
//GO.FT18F001 DD UNIT=NOSHR,SPACE=(088,(90),,CONTIG,ROUND),
//  DCB=(DSORG=DA,RECFM=VBS)
//GO.FT19F001 DD UNIT=NOSHR,SPACE=(208,(150),,CONTIG,ROUND),
//  DCB=(DSORG=DA,RECFM=VBS)
//GO.FT26F001 DD UNIT=NOSHR,SPACE=(CYL,(1,1),,CONTIG),
//  DCB=(DSORG=DA,RECFM=VBS)
//GO.FT27F001 DD UNIT=NOSHR,SPACE=(CYL,(1,1),,CONTIG),
//  DCB=(DSORG=DA,RECFM=VBS)
//GO.FT28F001 DD UNIT=NOSHR,SPACE=(CYL,(1,1),,CONTIG),
//  DCB=(DSORG=DA,RECFM=VBS)
//GO.FT21F001 DD UNIT=NOSHR,DISP=(,DELETE),
//  SPACE=(CYL,(1,1),,CONTIG),
//  DCB=(,BLKSIZE=0133,LRECL=0126,RECFM=F)
//GO.FT22F001 DD UNIT=NOSHR,DISP=(,DELETE),
//  SPACE=(CYL,(1,1),,CONTIG),
//  DCB=(,BLKSIZE=0133,LRECL=0126,RECFM=F)
//GO.FT16F001 DD UNIT=3330,SPACE=(088,(2000),,CONTIG,ROUND),
//  DCB=(DSORG=DA,RECFM=VBS)
//GO.FT15F001 DD UNIT=3330,DISP=(,DELETE),
//  SPACE=(CYL,(1,1),,CONTIG)
//GO.FT17F001 DD UNIT=3330,SPACE=(088,(2000),,CONTIG,ROUND),
//  DCB=(DSORG=DA,RECFM=VBS)
//GO.FT31F001 DD UNIT=NOSHR,DISP=(,DELETE),SPACE=(CYL,1),
//  DCB=(RECFM=FB,LRECL=80,BLKSIZE=80)
//GO.FT30F001 DD UNIT=3330,DISP=(,DELETE),SPACE=(CYL,(1,1)),
//  DCB=(LRECL=200,BLKSIZE=816,RECFM=VBS)
//GO.SYSABEND DD SYSOUT=A
//GO.SNAP DD DUMMY
//GO.GRAPHLIB DD DSN=UGA.SYS1.GRAPHLIB,DISP=SHR
//  DD DISP=SHR,DSN=UGA.SYS2.GRAPHLIB
//GO.SYSUPLOT DD DUMMY
//GO.DISPLAY DD UNIT=2F1
//GO.GRAPH1 DD UNIT=AFF=DISPLAY
//GO.FT38F001 DD DSN=UGA.SYS1.MVAP,DISP=SHR
//GO.FT23F001 DD DSN=UGA.SYS1.EXT2E0A,DISP=SHR
//GO.FT08F001 DD DSN=UGA.PLOTA,DISP=SHR,
//  DCB=(RECFM=VBS,LRECL=364,BLKSIZE=2000)
//GO.SYSUT2 DD DSN=UGA.PLOTB,DISP=SHR,
//  DCB=(RECFM=VBS,LRECL=364,BLKSIZE=2000)

* SYS1.PLILIB: PL/1 Library
3.2. Introduction to the Subroutines of the GMS Library *

UGA.SYS1.GRAPHLIB is a load module Partitioned Data Set
(PDS) containing all of the subroutines for the GMS library.
This PDS has been unloaded to the first file of the tape
volume U0120 using the OS utility IEBCOPY. This tape is 800
BPI, 9 track and non-labeled. The length, in bytes, and
the name of each subroutine are included in Table 3.2.1.

PURPOSE 

The  library  of  GMS  consists  of  many  mathematical  and 
statistical  programs  needed  to  perform  statistical  analyses 
from the graphics terminal. All subroutines have been
compiled through the FORTRAN and Assembler F language
processors. To execute a given unit, the user answers questions,
enters  his  data,  observes  his  output  and  often  has  some 
control  over  the  sequence  of  tasks  performed.  Every  command 
is  given  as  an  explicit  instruction  in  the  display. 

METHOD 

When you see a display on the IBM 2250 screen beginning
"* GRAPHICS MONITOR SYSTEM * THE GRAPHICS DEVICE CAN BE
USED BY", it means that GMS has started operation.
Then you follow the instruction on the display, and depress
any one of the lighted PFK's.

* From  Yen,  Ming,  "Transportability  of  Interactive  programs",  unpublished 
MS  thesis,  1977. 
The user should simply follow the instruction shown on
the screen. He begins by pressing any of the Programmed
Function Keys (PFK) except PFK #0. Next he either types
$NAMES to see a list of the programs available, or he
types $LINK prog, where prog is the name of the program he
wants to use. In either case the typed response is followed
by an EOB signal, i.e., depressing both the ALT key and the
5 key together. If the user types $NAMES and sees a list
of program names, he may continue with $LINK prog as
described later.

The user may be asked by a program to respond in any
of 3 ways:

1.  Type  a response  followed  by  an  EOB.  Any  response 
entered  by  the  typewriter  is  followed  by  the  EOB 
sequence. 

2.  Press  the  appropriate  PFK  key  on  the  panel  as 
directed  by  the  program.  Some  units  make  this 
easier  by  lighting  the  keys  which  have  special 
functions  at  any  given  state. 

3. Place the tip of the light pen on the screen at
the appropriate lighted area and operate the foot
pedal.

Some  features  common  to  most  units  are: 

1. PFK 31 is the "panic button"; it terminates most
programs. Press PFK 31 when you want to terminate
or you don't know what else to do.

2. Even if you are familiar with the program, do not
answer questions before they are asked. If you
accidentally do this, it may be necessary to press
key 31 to terminate and then relink your program.
The responses are queued and a mismatch may occur if
a response is given out of sequence.

3.  You  can  return  to  the  beginning  of  many  programs 
by  pressing  key  30. 

DESCRIPTION  OF  PACKAGE  PROGRAM 

We  can  divide  all  the  subroutines  into  the  following 
four  parts; 

1.  Initialization. 

2.  Graphics  Calculator  Mode. 

3.  Statistical  Conversational  Units  for 

• Exploratory  Research 

• Data  Analysis 

• Student  Program  Checkout 

4.  Graphics  Utilities: 

• GWRITERB 

• GWRITERL 

Table  3.2.2  is  included  below  because  it  contains  an 
easy-reference  summary  of  the  subroutines  in  the  GMS 
library. 

1. INITIALIZATION
MXCTROL see Ref. [18]

MXCTROL  constitutes  the  Master  Control  Task  for  the 
GMS.  A mini-monitor  system,  MXCTROL  is  the  first  module 


TABLE 3.2.2

PROGRAMS SUMMARY OF SUBROUTINES IN GMS LIBRARY

SUBROUTINE                                        EFFECT OF
NAME        PURPOSE                               PFK KEYS

I. INITIALIZATION

MXCTROL     Constitutes the master control        None
            task for GMS. MXCTROL is the
            first module loaded and performs
            the following steps:
            • Write-to-operator-with-reply
            • Attach XCTROL
            • Enter 'wait' state until the
              operator types a reply or the
              sub-task terminates.

XCTROL      Receives control from MXCTROL         #0 thru #31
            via the ATTACH macro, initializes
            and writes first display on the
            screen. Processes $LINK, $NAMES,
            $RESET, $END commands.

BASIC       Produce basic character size          None
            for COMAP.

LARGE       Produce large character size          None
            for COMAP.

COMAP       Enable the Assembler language         None
            programmer to utilize the IBM
            2250 graphics terminal as a
            conversational tool.

TABLE 3.2.2 (continued)

SUBROUTINE                                        EFFECT OF
NAME        PURPOSE                               PFK KEYS

II. GRAPHICS CALCULATOR MODE

CALCG       The graphics calculator is a          None
            single statement interpreter.
            Statements are written in
            FORTRAN-like syntax.

CALCO       Plots as a function of X any          #0, #4, #5, #6,
            expression currently available        #7, #8, #9, #31,
            in the calculator mode.               #20, #10, #30

PLOTF       Plots as a function of X any          Same as CALCO
            expression currently available
            in calculator mode too.

III. STATISTICAL CONVERSATIONAL UNIT

ANACOV      Analysis of covariance on one         #0, #15, #30,
            response variable and up to 9         #31.
            concomitant variables. One or
            two grouping factors with up to
            12 levels each.

ELLIPSE     Displays clusters by projecting       #0, #31.
            them on various two dimensional
            subspaces.

FORECAST    Second part of Leontief               #0, #29, #30,
            input-output analysis. Obtains        #31.
            raw material requirements from
            projected variations and
            correlations of demand. Network
            built by "MODEL".


TABLE 3.2.2 (continued)

SUBROUTINE                                        EFFECT OF
NAME        PURPOSE                               PFK KEYS

III. STATISTICAL CONVERSATIONAL UNIT

RATIO       Investigates the accuracy of          None
            empirical estimators of the
            parameters P and K from the
            negative binomial distribution.

REGRES      Computes and displays estimated
            least squares regression line
            for up to 40 data points.

SPLINE      Plot a spline function thru 3         #1, #3,
            to 50 points. Can be smoothed         light pen
            by use of light pen.

SPOOK       Multivariate analysis of              #1, #2, #3, #5,
            variance of irregular data,           #6.
            with up to 10 response variables
            and 2 grouping factors with up
            to 12 levels each.

WEIBUL      Estimates the parameters of a         #4, #5, #10,
            2-parameter WEIBULL distribution      #11.
            using order statistics and
            modified bioassay techniques.

PATS        Performs analysis of variance         #1, #2, #4, #3,
            for any complete factorial            #5, #28, #29,
            design of the form                    #30, #1, #2,
            2^R 3^S 4^T < 513 where               #3.
            R+S+T < 4. Obtains response
            equations and contour maps.
TABLE 3.2.2 (continued)

SUBROUTINE                                        EFFECT OF
NAME        PURPOSE                               PFK KEYS

MODEL       First section of the input-           None
            output analysis program.
            Designed to assist you in
            constructing a model of the
            network flow used in
            forecasting (see FORECAST).

OMTAB       A column and matrix oriented          None
            computing system (OMNITAB,
            National Bureau of Standards)
            which is particularly good for
            matrix calculations. Numerical
            values are stored in a 12
            section 80 by 30 worksheet
            which may be inspected at any
            time.

PRODFLOW    Product flow analysis, where          #20, #1, #31,
            the user supplies the service         #5, #25, #26,
            properties of a queue. Output         #27, #28, #29,
            may be a statistical                  #30, #10.
            distribution of queue lengths
            and various optimization
            designs for service.

QUANTAL     Analysis of Quantal response          #1, #2, #4, #5,
            data. (PROBIT, LOGIT, WEIBULL,        #6, #7, #8, #9,
            ARCSINE, etc.)                        #10, #30, #31.

TABLE 3.2.2 (continued)

SUBROUTINE                                        EFFECT OF
NAME        PURPOSE                               PFK KEYS

III. STATISTICAL CONVERSATIONAL UNIT (continued)

BLOWUP      Fitting statistical                   #1, #2, #3.
            distributions, power functions,
            with plots of such functions
            and blowup.

IV. GRAPHICS UTILITIES

GWRITERB    Takes an input sequential data        #1, #28, #29,
            set, builds a random access           #30, #31.
            data set as an intermediate
            data set, and displays the
            contents of the random access
            data set by page number upon
            request by the 2250 user. It
            uses basic sized characters.

GWRITERL    Takes an input sequential data        Same as above
            set, builds a random access
            data set, and displays the
            contents of the random access
            data set by page number upon
            request by the 2250 user. It
            uses large sized characters.
loaded and performs the following steps:

• Write-to-operator-with-reply

• Attach XCTROL

• Enter 'wait' state until the operator types a reply
or the sub-task terminates

If the sub-task terminates or the operator replies
'RGMS', a DETACH macro is issued and the above steps are
repeated. If the operator replies 'XGMS', MXCTROL
terminates itself.
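
The control loop just described can be sketched as follows. This is only an illustration in Python, not the original assembler code: the function name run_mxctrol and the event labels are hypothetical, a plain list of events stands in for the operator console and the sub-task, and the treatment of any other reply is an assumption.

```python
def run_mxctrol(events):
    """Replay a list of events through the MXCTROL loop and log each action.

    Event labels are hypothetical: "subtask_ended" (XCTROL finished),
    "restart_reply" and "exit_reply" (the two operator replies).
    """
    log = []
    pending = iter(events)
    while True:
        log.append("WTOR")             # write-to-operator-with-reply
        log.append("ATTACH XCTROL")    # start the daughter task
        while True:                    # 'wait' state
            # Assumption: behave like an exit reply once the input runs out.
            event = next(pending, "exit_reply")
            if event == "exit_reply":
                log.append("TERMINATE MXCTROL")
                return log
            if event in ("subtask_ended", "restart_reply"):
                log.append("DETACH XCTROL")
                break                  # repeat the three steps
            # Assumption: any other reply is ignored and the wait continues.
```

Each sub-task termination or restart reply produces one DETACH followed by a fresh WTOR and ATTACH, which matches the repeated initial display seen by the 2250 user.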

XCTROL see Ref. [18]

XCTROL receives control from MXCTROL via the ATTACH
macro, which means that XCTROL is the daughter task and
MXCTROL is the mother task in a multi-tasking environment.
XCTROL initializes and writes the first display on the
2250 screen. If the 2250 device power is off at the time,
the following message will be written on the computer
operator's console seven times:

"*IEA000 ADR,I/OERR,**,0200,4000,#@$GMS"

where ADR is the unit address, in hexadecimal, of the
2250 device, and the last field, #@$GMS, is the job name for
GMS, both fields being installation-dependent. The above
message is generated by the I/O supervisor, and the explanation
for the seven occurrences is that XCTROL performs
seven initial I/O sequences to the 2250 display regardless
of its ON/OFF condition.

XCTROL then goes into a wait state, until any one of
the keys on the PFK is depressed. If PFK #0 is depressed
first, XCTROL is terminated with a user completion code of
U1000, the effect being that the initial display is removed
from the screen.

The audible alarm sounds, and the same display comes
back on the screen. In reality XCTROL terminates, returns
control to MXCTROL, is detached and reattached by MXCTROL,
at which time XCTROL begins its initialization sequence
again.

For an attention interrupt resulting from any PFK
other than key #0, XCTROL comes out of the 'wait' state and
continues with the next display.

The 2250 screen is partitioned into 2 sections: the
larger section called the Output Area, and the smaller section
called the Reply Area. Control of the 2250 and the above-mentioned
screen format is provided by COMAP, a Conversational
Macro Package for the IBM 2250 assembler-language
programmer.

The cursor appears in the first position in the Reply
Area and indicates where the next character typed from the
alphameric keyboard will appear, both visibly on the screen
and invisibly in the buffer of the 2250. The EOB sequence causes
an attention interrupt at the 360 or 370.

The  2nd  display  depicts  the  following  special  commands 
that  are  acceptable  to  XCTROL. 

$LINK,  $NAMES,  $RESET,  $END 
PROCESSING THE $LINK COMMAND

A model command is $LINK name, where 'name' is supplied
by the 2250 operator, e.g. $LINK CALCG. 'NAME' is
first compared with each of the following:

BASIC, COMAP, LARGE, XCTROL, MXCTROL

These names represent protected GMS control modules and
cannot be loaded and executed by the 2250 operator. If a
match occurs, then the message:

*** NAME IS A PROTECTED 'GMS' PROCESS ***

is displayed on the 2250 screen, followed by

*** READY FOR COMMAND ***

which signals to the 2250 operator that XCTROL is awaiting
a command to be typed in from the alphameric keyboard.

Otherwise, the directory of the graphics library is
searched for a match. This search is accomplished by the
BLDL macro. If no match occurs for this search, then the
message

*** NAME NOT FOUND IN GRAPHICS LIBRARY ***

is displayed on the screen, and XCTROL awaits another
command from the 2250 operator as described earlier.

If the directory search produced a match (return
code = 0 from BLDL), then the 2250 is released (RLSEG macro
is executed) and the selected module (operand of the $LINK
command) is loaded and executed via the XCTL macro, which
means that XCTROL is replaced in memory by the selected
load module, economizing on the use of core storage for
GMS.
PROCESSING THE $NAMES COMMAND

A model command is $NAMES and the result is a new
display depicting all the program names (members) presently
in the graphics library.

These names are read and extracted from the directory
of the GMS library by XCTROL, which displays three names per
line on the 2250.

PROCESSING THE $RESET COMMAND

A model command is $RESET and the result is the erasing
of the present display on the screen with the next display
being the 2nd display generated by XCTROL.

PROCESSING THE $END COMMAND

A model command is $END and the result is the termination
of XCTROL.

PROCESSING AN ILLEGAL COMMAND

If the command is unintelligible to XCTROL, the
message:

*** ILLEGAL COMMAND ***

is displayed on the screen.

Each message typed in by the 2250 operator is immediately
displayed (up to 40 characters), followed by the
action as described earlier. For the processing of all
accepted commands, with the exception of the $END command,
XCTROL concludes its processing by re-displaying the
message:

*** READY FOR COMMAND ***

on the screen, indicating to the 2250 operator that XCTROL
is done with its processing and is expecting a reply at
the time.
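
The command handling described in this section can be summarized in a short sketch. It is written in Python purely for illustration; the original is assembler code using the BLDL and XCTL macros. The function process_command, the set standing in for the library directory, the parenthesized placeholder lines, the alphabetical ordering of $NAMES output, and the READY message after an illegal command are all assumptions.

```python
PROTECTED = {"BASIC", "COMAP", "LARGE", "XCTROL", "MXCTROL"}
READY = "*** READY FOR COMMAND ***"


def process_command(line, library):
    """Return the lines shown on the screen for one typed command.

    `library` is a set of member names, a hypothetical stand-in for
    the directory search that XCTROL performs with BLDL.
    """
    shown = [line[:40]]                   # echo of the message, up to 40 characters
    words = line.upper().split()
    if len(words) == 2 and words[0] == "$LINK":
        name = words[1]
        if name in PROTECTED:
            shown.append("*** NAME IS A PROTECTED 'GMS' PROCESS ***")
        elif name not in library:
            shown.append("*** NAME NOT FOUND IN GRAPHICS LIBRARY ***")
        else:
            shown.append("(control passes to " + name + ")")
            return shown                  # XCTROL is replaced by the module
    elif words == ["$NAMES"]:
        names = sorted(library)           # assumption: alphabetical order
        for i in range(0, len(names), 3): # three names per line
            shown.append("  ".join(names[i:i + 3]))
    elif words == ["$RESET"]:
        shown.append("(second display)")
    elif words == ["$END"]:
        shown.append("(XCTROL terminates)")
        return shown
    else:
        shown.append("*** ILLEGAL COMMAND ***")
    shown.append(READY)                   # assumption: also shown after an illegal command
    return shown
```

For example, process_command("$LINK COMAP", {"CALCG"}) yields the echo, the protected-process message, and the READY message, in that order.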

BASIC see Ref. [18]

BASIC produces basic sized display characters for
COMAP.

LARGE see Ref. [18]

LARGE produces large sized display characters for
COMAP.

COMAP (Conversational Macro Package) see Ref. [4]

COMAP enables the assembler language programmer to
utilize the IBM 2250 graphics device as a conversational
tool with a minimum of programming effort. The following
services, normally required of programmers utilizing the
problem-oriented routines supplied by IBM (GPS), are
provided by COMAP.

• Automatic control and formatting of the output
display.

• Dynamic memory allocation and queuing of input
messages from the alphameric keyboard, asynchronous
to program execution.

• Buffer management in the graphics control unit and
handling of all attention interrupts from the 2250
device.

The major design criterion of COMAP was to provide a
convenient means for the assembler language programmer to
utilize the 2250. The following is a summary of the COMAP
TABLE 3.2.3

THE CHARACTERISTICS OF EACH CHARACTER SIZE

                                          CHARACTER SET
CHARACTERISTICS                           BASIC     LARGE

• Maximum number of characters
  per line
• Lines in output area
• Lines in reply area
• Characters in output area               3404      1421
• Characters in reply area
• Load module storage estimate            3669      1642

(1) Subtract 3 for random message processing to allow for
the ID (2 digits plus a blank)

(2) This storage estimate reflects the size of the graphics
display buffer maintained by COMAP (including 62 bytes
for a linkage table)
macro instructions:

symbol   BKSPG   [NC=dd]
symbol   CTRLG   PFK=([pfkmask,pfkaddr]),[LP=lpaddr][,ALARM=YES/NO]
symbol   DPLYG   NC=dd,BA=bfraddr[,ID=idaddr]
symbol   ERASE   [NL=dd]
symbol   INITG   [SIZE=BASIC/LARGE]
symbol   RLSEG
symbol   RPLYG   NC=dd,BA=bfaddr[,ID=idaddr]

2. Graphic Calculator Mode

CALCG see Ref. [18]

CALCG is a single statement interpretive processor;
the statements are written in FORTRAN-like syntax with the
following rules:

a) Statements are scanned left to right, and the 5 arithmetic
operations, as follows:

"+" for addition
"-" for subtraction
"*" for multiplication
"/" for division and
"**" for power

assume no pre-determined hierarchy in evaluating the
statement. In other words, each statement is scanned (left
to right) as a single expression and evaluated as if it
were a "polish stack" in memory.
EXAMPLES: Y=A+B*2 means Y=(A+B)*2

Y=A+B*2+C*3 means Y=(((A+B)*2)+C)*3
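
The strict left-to-right rule can be illustrated with a small evaluator. This is a sketch in Python, not the CALCG implementation: the function evaluate and its token pattern are hypothetical, and it handles only the right-hand side of a statement made of constants, symbols, and the five operators (no functions or parentheses).

```python
import re

# One token is an operator ("**" is tried before "*"), a symbol, or a constant.
TOKEN = re.compile(r"\*\*|[-+*/]|[A-Za-z][A-Za-z0-9]*|\d+\.?\d*|\.\d+")

OPERATORS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "**": lambda a, b: a ** b,
}


def evaluate(expression, symbols):
    """Evaluate 'term op term op ...' strictly left to right, with no operator hierarchy."""
    compact = "".join(expression.split())
    tokens = TOKEN.findall(compact)
    if "".join(tokens) != compact or len(tokens) % 2 == 0:
        raise ValueError("malformed expression: " + expression)

    def value(token):
        # A symbol is looked up; anything else must be a numeric constant.
        return symbols[token] if token in symbols else float(token)

    result = value(tokens[0])
    for i in range(1, len(tokens), 2):
        result = OPERATORS[tokens[i]](result, value(tokens[i + 1]))
    return result
```

With A=1, B=2, C=3, evaluate("A+B*2", ...) gives 6 and evaluate("A+B*2+C*3", ...) gives 27, whereas ordinary FORTRAN precedence would give 5 and 14.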

b) The three allowable terms in an expression are the
following:

1) Constant (integer, fixed point, or floating point
format)

2) Symbol (8 or fewer characters in each variable name)

3) Function (The available functions will be displayed
on the screen upon request by the 2250 operator, as
described later in this section.)

c) Only one term is allowed within a single set of parentheses.
In other words, CALCG cannot accept multi-termed
expressions within parentheses. A variable number of arguments
is allowed in an operand sub-list appearing in a functional
expression, as long as each item of the sub-list (i.e. the symbols
or constants separated by commas) does not violate the
above rules concerning nested sets of parentheses, and
as long as the number of items in the sub-list does not
exceed 5.

Special CALCG commands include:

1. *RESET: To erase all symbols and their values in
memory and start CALCG back at the beginning.

2. *OPERATIONS: To list the names of arithmetic
operators.

3. *FUNCTIONS: To list the functions provided by CALCG.

4. *LIST: To list each symbol and its current value.

5. *END: To terminate CALCG and to return control back
to the Graphics Control Program.

Functions currently available include: LOG, SQRT, TAN,
ATAN, SINH, ERF, LGAMA, DIGAM, YORMZ, GAMZ2, CHIZ2, TTX2,
BETAZ3, FFZ3, LOG10, SIN, ARSIN, ATAN2, COSH, ERFC, MAX,
YORMX, GAMX2, BETAX3, FFX3, GAMNC3, EXP, COS, ARCOS, COTAN,
TANH, GAMMA, MIN, YORMP, GAMP2, CHIP2, TTP2, BETAP3, FFP3,
and BETNC5.

The statistical function names above use the following
conventions:

(i) Z indicates the ordinate (probability density
function)

(ii) X indicates the integral (cumulative distribution
function)

(iii) P indicates the percentage points (an X value
such that the area to X equals the input value)

(iv) The number at the end of the name is a reminder
of the number of input arguments required.

(v) YORM=Normal, TT=Student's t, FF=F, CHI=Chi-square,
GAM=Gamma, BETA=Beta, GAMNC3=Non-central Gamma, BETNC5=Non-central
Beta (2 types: the confluent hypergeometric has last
argument=1; the ${}_2F_1$ hypergeometric has last argument=2, for
the non-central distribution of $r^2$, the square of the multiple
correlation. Thus BETNC5($r^2$,q,n,$\rho^2$,2) evaluates the c.d.f. of
$r^2$, with q predictors, sample size n, and non-central value
$\rho^2$).

CALCO 


CALCO  plots  as  a function  of  x any  expression  currently 
available  in  the  calculator  mode. 

EXAMPLE: Y=TTP2(X,10)

The  user  should  note  that  PFK  #20  rather  than  PFK  #30 
is  used  to  restart  this  program.  PFK  #0  rather  than  PFK  #31 
terminates  the  program. 

An  expression  may  be  plotted  for  several  values  of  a 
parameter,  say  p. 

EXAMPLE: Y=TTP2(X,P) where P=10 to 110 by 25

Under  the  multiple  plot  option  the  user  should  remember 
that  the  function  to  be  plotted  must  have  at  least  two 
parameters. 
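
As a small illustration of the "10 to 110 by 25" notation, the sketch below lists the parameter values that such a specification would generate. It is written in Python, the function name sweep is hypothetical, and it assumes the upper limit is inclusive, as in a FORTRAN DO loop.

```python
def sweep(start, stop, step):
    """List the values start, start+step, ... that do not exceed stop."""
    if step <= 0:
        raise ValueError("step must be positive")
    values = []
    value = start
    while value <= stop:       # assumption: the upper limit is inclusive
        values.append(value)
        value += step
    return values
```

Under that assumption, sweep(10, 110, 25) gives the five values 10, 35, 60, 85, and 110, so the example above would produce five plots.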

After  all  information  about  the  function  has  been 
entered  the  user  is  instructed  to  press  PFK  #30  for  a vector 
plot  or  PFK  #31  for  a point  plot.  Under  the  multiple  plot 
option,  no  matter  which  key  he  presses,  one  plot  will  be 
displayed  as  a vector  plot.  In  this  plot  the  variable 
parameter  is  at  its  lowest  value. 

After  the  first  plot  is  displayed,  the  user  may  add 
all  other  graphs  by  pressing  either  PFK  #30  or  PFK  #31. 

If he presses PFK #10, exactly one plot is added. By repeatedly
pressing PFK #10 the user may add more plots to
the display.

All  plots  of  the  function  after  the  first  one  will  be 
point  plots.  Ordinarily  a function  should  not  be  plotted 
for  more  than  7 values  of  a variable  parameter  at  once. 
If too many points on the screen are lighted, the display will be
erased and remaining functions plotted separately. To remedy this
situation the user might try reducing the interval on the Y axis or he
might increase the increment.

CALCO can be used to plot mixtures of distributions where one
distribution has a variable parameter.

EXAMPLE: Y=GAMX2(X,6)*1.5+GAMZ2(X,A)*0.4
where A is a variable parameter.

Since CALCO, like CALCG, uses polish stacking, the expression
above is equivalent to:

Y=0.6GAMX2(X,6)+0.4GAMZ2(X,A)
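
Written out step by step, the left-to-right evaluation applies the final factor 0.4 to everything accumulated so far, which is why the coefficient of the first term becomes 1.5 times 0.4:

```latex
\begin{aligned}
Y &= \bigl(\mathrm{GAMX2}(X,6)\times 1.5 + \mathrm{GAMZ2}(X,A)\bigr)\times 0.4 \\
  &= (1.5\times 0.4)\,\mathrm{GAMX2}(X,6) + 0.4\,\mathrm{GAMZ2}(X,A) \\
  &= 0.6\,\mathrm{GAMX2}(X,6) + 0.4\,\mathrm{GAMZ2}(X,A).
\end{aligned}
```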

PLOTF 

PLOTF  plots  as  a function  of  X any  expression  currently  available 
in  the  calculator  mode  (CALCG). 

EXAMPLE: Y=TTP2(X,10)

Y=SIN(X)/X 

Y=SIN(X)+COS(X) 

The  option  for  plotting  several  values  of  a parameter  is  not 
operational.  It  is  available  in  another  program  called  BLOWUP. 

The  unit  called  BLOWUP  permits  considerably  greater  flexibility 
than  either  CALCO  or  PLOTF. 

3.3  Adaptation  to  Non-graphics  Equipment 

Documentation:  Appendix  U,  appendix  vol.  V,  pp.  192-285,  and  THEMIS 
report  No.  30  (Hayward  and  Bargmann  [38]) 

The  analysis-of-covariance  unit  (see  Section  2.3)  was  used  as  an 
example  for  adaptation  to  a different  computer  (Control  Data  Cyber) 
with  a standard,  non-graphics  terminal  (Texas  Instruments,  Silent  700). 
This  choice  was  prompted  by  the  following  considerations: 

(a)  Instructions  and  output  appearing  on  the  screen  were  voluminous, 
thus  the  user  had  to  be  given  an  option  to  see  selected  portions 
of  output,  in  random  order; 

(b)  the  program  had  been  designed  for  very  fast  execution,  but  required 
substantial  amounts  of  core  - hence  an  elaborate  overlay  tree  is 
required; 


(c) considerable use was made of program function keys, which had to
be simulated on the simpler equipment;

(d) the unit could perform most of its functions without the use of
lightpen or graphics.

Formally, the subroutines of the COMFORT package (see Ref. [18])
had to be converted from IBM graphics mode to Control Data teletype
mode; most names were simply retained, and their functions altered, so
that this adaptation will remain valid for other units.

To permit selective display, at the user's request, use was made of
random access ("mass storage") mode. Thus, the user merely types
the part number of the desired display, and may go back and forth in
random order. To a limited extent this procedure had been followed on
the graphics system.
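
The idea of selective display through random access can be sketched as follows. This is a minimal illustration in Python, not the Control Data implementation: the class name, the file layout (one file plus an in-memory index of byte offsets), and the part titles are all assumptions made for the example.

```python
import os
import tempfile


class PartStore:
    """Store numbered output parts in one file and fetch them in any order."""

    def __init__(self, path):
        self.path = path
        self.index = {}            # part number -> (byte offset, byte length)
        open(path, "wb").close()   # start with an empty file

    def write_part(self, number, text):
        data = text.encode("ascii")
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)   # current end of file
            f.write(data)
        self.index[number] = (offset, len(data))

    def read_part(self, number):
        offset, length = self.index[number]
        with open(self.path, "rb") as f:
            f.seek(offset)                    # jump directly to the part
            return f.read(length).decode("ascii")


path = os.path.join(tempfile.mkdtemp(), "output.dat")
store = PartStore(path)
store.write_part(1, "PART 1: MEANS AND VARIANCES\n")      # hypothetical titles
store.write_part(2, "PART 2: ADJUSTED MEANS\n")
store.write_part(3, "PART 3: TESTS OF HYPOTHESES\n")

# The user types part numbers in any order, going back and forth.
shown = [store.read_part(n) for n in (3, 1, 3, 2)]
```

All output is written once; each request then costs one seek and one read, regardless of the order in which parts are requested.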

Overlay procedures differ considerably from system to system, even
for computers from the same manufacturer. The earlier conversion of a
similar unit (batch version of MUDAID, see Section 2.4) from IBM 360/65
to IBM 360/20 DOS served as guidance for this problem. Fortunately,
the overlay technique on Control Data systems is very simple.

To simulate the program function keys which, in the graphics unit,
could be depressed in lieu of typed entry, the user was directed to
prefix the "program function key" number by the symbol =. Thus, if instead
of replying to a request (by the computer) the user wished to depress
program function key 30 for re-start, he or she would type *30.
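
A typed reply can be separated from a simulated function key with a few lines of code. The sketch below is illustrative only: the function name and return format are hypothetical, and the prefix character is a parameter because the text shows both "=" and "*" for it. The upper limit of 31 follows the key numbering of the IBM 2250 keyboard.

```python
def parse_reply(line, prefix="=", max_key=31):
    """Classify a typed line as a simulated function key or an ordinary reply.

    Returns ("pfk", key_number) or ("reply", text).
    The prefix character is an assumption and can be changed by the caller.
    """
    text = line.strip()
    if text.startswith(prefix):
        digits = text[len(prefix):]
        if digits.isdecimal() and int(digits) <= max_key:
            return ("pfk", int(digits))
    return ("reply", text)
```

For example, `parse_reply("*30", prefix="*")` is classified as key 30, while a numeric reply such as `"12.5"` is passed through unchanged.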

Novel  features:  Use  of  random  access  mode  to  permit  display  of 
selected  portions  of  voluminous  output. 

Since the Control Data Cyber at the University of Georgia has been
accessible statewide, by telephone or remote job entry, since 1973, this
unit was used and demonstrated at other installations (West Georgia
College, Armstrong College). This adaptation was also used in a master's
thesis in demography.

3.4  Considerations for Adaptation to other Graphics Equipment*

As  of  this  writing  (December,  1977)  there  appears  to  be  little 
standardization of graphics support programs. IBM announced the
availability of a graphics terminal, 3350, in the fall of 1978, and the
support  programs  of  the  2250  may  be  suitable.  In  the  meantime,  however, 
it  seems  desirable  to  consider,  in  detail,  the  adaptation  of  a portion 
of  our  statistical  conversational  units  to  inexpensive  and  widely 
available  graphics  terminals. 

We chose to compare the IBM 2250 graphics console with the
Tektronix 4006. Within this section, the following topics are covered:
(a) the differences in the features of the graphics subroutines for
the IBM and Tektronix terminals, (b) the Tektronix software, and (c)
the use of the cursor on both the IBM and Tektronix models. With this
background, we can modify the IBM subroutines to be compatible with the
Tektronix 4006.

DIFFERENCES  BETWEEN  THE  IBM  2250  AND  TEKTRONIX  4006. 

To  present  an  accurate  description  of  the  software  and  internal 
logic  of  the  Tektronix  4006,  it  would  be  best  to  examine  the  external 
or  physical  attributes  of  the  system,  after  which  we  can  proceed  to 
the  problem  of  dealing  with  the  conversion  of  the  software. 

The Tektronix display terminal is a communications link and display
device for use with a wide range of computer systems. The unit is
completely self-contained. The display, the keyboard, the operating controls,
and the electronic circuitry to operate the display and to communicate
with the computer are conveniently located and contained within the unit.
The terminal is designed to be directly or remotely connected to the
computer. Thus, computer operations can be directly influenced by the user
of the terminal. Finally, there are three basic modes of
operation which are a part of the display: (1) Alphanumeric
(Alpha); (2) Graphic; and (3) Hard copy.

*From Yen, Ming, "Transportability of interactive programs," unpublished
MS thesis, 1977.

Let  us  study  how  this  differs  from  the  IBM  2250. 

There are three major differences in the features of
the IBM 2250 and the Tektronix 4006. The Tektronix has no
Program Function Key (PFK), Light Pen, or Display Buffer,
all of which are contained on the IBM 2250. Table 3.4.1
provides a quick comparison of the two terminals. Even
though the Tektronix is devoid of the above three features,
this does not imply that they cannot be implemented.

PROGRAM  FUNCTION  KEY 

The  IBM  2250  has  a Program  Function  Keyboard  (PFK) 
located  to  the  left  of  the  main  keyboard.  The  thirty-two 
keys,  numbered  from  0 to  31,  are  used  to  permit  the  user  to 
make  decisions  in  the  conversational  package  program.  We 
do  not  find  the  PFK's  on  the  Tektronix  4006. 

LIGHT  PEN 

We can use the Light Pen on the IBM 2250 to communicate
with the user program. The operation of the light pen is
described in Section 3.1. There is no light pen on the
Tektronix 4006.

TABLE 3.4.1

TERMINAL COMPARISON: IBM 2250 VS. TEKTRONIX 4006

FEATURE                 IBM 2250    TEKTRONIX 4006

Program function key    YES         NO
Light pen               YES         NO
Graphics mode           YES         YES
Display buffer          YES         NO
Cursor                  YES         YES
Alphameric keyboard     YES         YES

HANDLING  OF  THE  LIGHT  PEN  AND  PFK 

There is no convenient way to simulate the "light
pen" on the Tektronix 4006 in graphics mode. The best
solution is to move up to the Tektronix 4012 graphics
terminal, which offers a "cross-hair cursor" for graphics
input mode. This feature is more desirable than the light
pen, because any X, Y coordinate can be input to the
computer via the cross-hair cursor, whereas, with the light
pen, only the X, Y coordinates that are lighted can be
input to the computer. A way to represent the PFK number
is to type some special character or a name followed by a
number from 1 to 31 on the keyboard. (In the CDC Cyber
application [38] on Texas Instruments terminals, PFK's were
initiated by the character =; "press key 10" would be
represented by "*10".)

GRAPHICS  SUBROUTINE 

Most of the programs in the GMS library depend upon the
GRAF and the COMFORT graphics subroutines (Table 3.4.2).
The majority of the GRAF and COMFORT subroutines would have
to be modified for the conversion to the Tektronix 4006.
Table  3.4.2  provides  a brief  description  of  each  graphics 
subroutine  and  how  to  modify  it  to  run  with  the  Tektronix 
software. We can now begin to get an overall concept of
the relationship that the features have to the internal
software support of the Tektronix systems. The importance
of the GRAF subroutines, though, should not be discussed
without a more thorough examination.

TABLE 3.4.2

MODIFICATIONS TO GMS SUBROUTINES REQUIRED TO
UTILIZE THE TEKTRONIX 4006 TERMINAL

PACKAGE   SUBROUTINE  PURPOSE OR REASON                                     EDIT
NAME      NAME                                                              SUBROUTINE

COMFORT   GBKSP       To backspace the number of lines on the screen        Modify
COMFORT   GERAS       To erase the lines from the screen                    Modify
COMFORT   GWAIT       To place the calling program into a 'wait' state      Modify
COMFORT   GPOST       To take the calling program out of 'wait' state       Modify
COMFORT   GCPFK       To represent attention, interrupts from the IBM 2250  Modify
COMFORT   GRINIT      To initialize the character size                      Dummy
COMFORT   GRRLSE      To release all main memory assigned to COMAP          Dummy
COMFORT   GRRPLY      To transfer the message typed by the user             Modify
COMFORT   GRDPLY      To display message on the screen                      Modify
COMFORT   GCLP        To represent attention or interruption from the       Modify
                      IBM 2250
COMFORT   GCALM       To set audible alarm                                  Modify
GRAF      DISPLA      To open graphics data control block and define        Dummy
                      display variable
GRAF      CORDCALL    Coordinate unit (is fixed in Tektronix)               Dummy
GRAF      PLACE       To point the light beam                               Modify to TPLOT
GRAF      LINE        To draw a straight line with the light beam           Modify to TPLOT
GRAF      POINT       To draw a point at the location indicated in the      Modify to TPLOT
                      call
GRAF      CHAR        To generate the string consisting of character mode   Modify to CHIN
GRAF      WRFMT$      To write CHAR on the scope under FORMAT control       Modify to CHOUT
GRAF      PLOT        To plot the image on the screen                       Modify to TPLOT
GRAF      APPEND      To append more display variables                      Dummy
GRAF      RESET       To remove orders from DV area                         Dummy
GRAF      UNPLOT      To remove the image from the screen                   Modify
GRAF      ERASE       To remove all the image from the screen               Modify
GRAF      BLANK       To blank entire screen                                Modify
GRAF      DETAIN      To wait for an attention                              Modify to 'wait' state
GRAF      DELAY       To check for an attention                             Modify to 'wait' state
GRAF      DETEKT      To check for attention                                Modify to 'wait' state
GRAF      LPNAME      No light pen                                          Dummy
GRAF      CUR$$       To set the position of the cursor                     Modify to CURSIS
GRAF      RCUR$       To remove the cursor                                  Modify to CURSIS
GRAF      SCTDM       To read the data that has been typed in from the      Modify to CHIN
                      alphanumeric keyboard
GRAF      SCTDV       To update the data that has been typed in             Modify to CHIN
GRAF      DVTDM       To place the resulting data in the FT04F001 dummy     Modify
                      buffer
GRAF      BUFRS       To reset FT04F001 pointers                            Modify
GRAF      ALARM       To set audible alarm                                  Modify
GRAF      SIZE        To compute the length of DV                           Modify
GRAF      DVDUMP      No display variable                                   Dummy
GRAF      LIGHTS      No program function keys                              Dummy
GRAF      PRTCHR      To simulate the current display on the printer        Modify

TABLE 3.4.3

GRAPHICS SUBROUTINES IN EACH MEMBER OF UGA.SYS1.GRAPHLIB

MEMBER     PACKAGE   GRAPHICS SUBROUTINES
NAME       NAME

ANACOV     COMFORT   GRINIT, GRRLSE, GRDPLY, GRRPLY, GCPFK,
                     GCLP, GCALM, GERAS, GBKSP, GWAIT
BLOWUP     COMFORT   GRINIT, GRRLSE, GRDPLY, GRRPLY, GCPFK,
                     GCLP, GWAIT
CALCO      COMFORT   GRINIT, GRRLSE, GRDPLY, GRRPLY, GCPFK,
                     GCLP, GCALM, GERAS, GBKSP, GWAIT, GPOST
ELLIPSE    GRAF      LIGHTS, DETEKT, PLOT, DETAIN,
                     DISPLA, CHAR, CORDCALL, LINE, POINT,
                     PLACE, DCORD, UCORD, UNPLOT, PLACE$,
                     POINT$, ERASE, BLANK, RESET, CUR$$,
                     BUFRS, SCTDV, RCUR$
FORECAST   GRAF      SCTDV, BUFRS, LIGHTS, DETEKT, PLOT, DETAIN,
                     RESET, RCUR$, ERASE, CUR$$, CHAR, APPEND,
                     DISPLA, UNPLOT, PLACE$, LINS$
MODEL      GRAF      BLANK, CUR$$, DISPLA, LIGHTS, PLOT,
                     RCUR$, RESET$, APPEND, ERASE, DSCTDV,
                     UNPLOT, ALARM, CHAR, PLACE$
PATS       COMFORT   GRINIT, GRRLSE, GRDPLY, GRRPLY, GCPFK,
                     GCLP, GCALM, GERAS, GWAIT
PLOTF      COMFORT   GRINIT, GRRLSE, GRDPLY, GRRPLY, GCPFK,
                     GCLP, GCALM, GERAS, GBKSP, GWAIT, GPOST
PRODFLOW   COMFORT   GRINIT, GRRLSE, GRDPLY, GRRPLY, GCPFK,
                     GCLP, GCALM, GERAS, GBKSP, GWAIT
QUANTAL    GRAF      DISPLA, PLOT, DETAIN, BUFRS, CUR$$,
                     LIGHTS, SCTDV, ERASE, RCUR$, RESET,
                     CHAR, DETEKT, UNPLOT, PLACE$$, BLANK,
                     CORDCALL, LINES, SIZE
RATIO      GRAF      CUR$$, DETAIN, DISPLA, PLOT, RCUR$,
                     RESET, UNPLOT, BUFRS, DETEKT, CHAR,
                     PLACE$
REGRES     COMFORT   GRINIT, GRRLSE, GRDPLY, GRRPLY, GCPFK,
                     GCLP, GCALM, GERAS, GBKSP, GWAIT, GPOST
SPLINE     COMFORT   GRINIT, GRRLSE, GRDPLY, GRRPLY, GCPFK,
                     GCLP, GCALM, GERAS, GBKSP, GWAIT
SPOOK      COMFORT   GRINIT, GRRLSE, GRDPLY, GRRPLY, GCPFK,
                     GCLP, GCALM, GERAS, GBKSP, GWAIT, GPOST
WEIBUL     COMFORT   GRINIT, GRRLSE, GRDPLY, GRRPLY, GCPFK,
                     GCLP, GCALM, GERAS, GBKSP

We know that the GRAF subroutines provide the FORTRAN
programmer with the ability to create and to modify
displays composed of points, lines, and characters, and in
addition, to plot and to erase the displays on the screen
of the IBM 2250. The GRAF subroutines also enable the user
to engage the light pen to select parts of the display,
which thereby permits the user to communicate with the
problem program. The use of the PFK's and indicator lights,
and the registering of information from the alphanumeric
keyboard are part of GRAF. But because the Tektronix (among
others) has no light pen or PFK's, we need to modify
the subroutines that are directed toward those features.

These modifications were outlined in Table 3.4.2. The
absence of these features and the modifications inherent
in the Tektronix system may raise questions as to its overall
effectiveness.

A set of software subroutines has been written by
Tektronix to facilitate the use of the Tektronix 4006
display terminal. The software includes basic subroutines for
each of the following operating modes: (1) Alphanumeric
mode, (2) Graphic Display mode and (3) Graphic Input mode.
All of the routines* are written in the FORTRAN language,
and thus will run on most computer systems. These
subroutines include a subroutine named TPLOT, which is part
of the graphics display mode in the Tektronix and would
replace the PLACE, LINE, POINT and CHAR subroutines in GRAF.

* Instruction Manual, 4006-1 Computer Display Users Manual, Tektronix, Inc.

It should be pointed out here that the coordinate point
system is a little different on the two graphics terminals.
With the Tektronix routine, we are not able to see if the
plotting on the screen is out of range; therefore, some
precaution may need to be taken as to the size of the
coordinate system that we are working in. The Tektronix
software does provide the user with cursor facilities
that are compatible with IBM's.
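
One possible precaution is to test, clamp, or rescale coordinates before they are passed to the plotting routine. The sketch below is in Python for illustration only. The screen limits (0 to 1023 horizontally, 0 to 779 vertically) are assumptions and should be checked against the terminal manual, and the function names are hypothetical, not part of the Tektronix software.

```python
X_MAX = 1023  # assumed largest addressable x; check the terminal manual
Y_MAX = 779   # assumed largest addressable y; check the terminal manual


def in_range(x, y, x_max=X_MAX, y_max=Y_MAX):
    """True if the point can be plotted without leaving the screen."""
    return 0 <= x <= x_max and 0 <= y <= y_max


def clamp_point(x, y, x_max=X_MAX, y_max=Y_MAX):
    """Move an out-of-range point to the nearest screen edge."""
    return (min(max(x, 0), x_max), min(max(y, 0), y_max))


def scale_to_screen(points, x_max=X_MAX, y_max=Y_MAX):
    """Map user coordinates linearly onto the full screen range."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_lo, y_lo = min(xs), min(ys)
    x_span = (max(xs) - x_lo) or 1.0   # avoid division by zero
    y_span = (max(ys) - y_lo) or 1.0
    return [
        (round((x - x_lo) / x_span * x_max), round((y - y_lo) / y_span * y_max))
        for x, y in points
    ]
```

Rescaling the whole data set once, before plotting, avoids the need to detect out-of-range points during drawing.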

CURSOR 

The IBM 2250 and the Tektronix 4006 employ a cursor
which is a small line beneath the character position to be
entered. That is, when a key is depressed, the corresponding
character appears in that position on the screen while
the cursor advances to the next position. We use the
software subroutines CUR$$ and RCUR$ (on the IBM) to handle this
cursor element. With a little modification, these can be
changed to CURSES, which is the software subroutine of the
Tektronix 4006 for the cursor element. Now that we have the
cursor subroutines established, their use should be properly
examined.

On the IBM, the user controls the position of the
cursor through the use of four keys on the console of the
IBM 2250: (1) advance, (2) backspace, (3) continue, and
(4) jump. The advance key, when depressed, moves the cursor
one character position to the right. The backspace key
operates in the same manner as the advance key, the
exception being that the direction is reversed. Depression of
the continue key in conjunction with either the advance or
the backspace key will cause the cursor to continue moving
forward or backward, respectively, until the keys are released.
Finally, the jump key moves the cursor from its present
position to the first character position in the next
unprotected block.

By contrast, in the Tektronix terminal, the user
controls the position of the cursor through the use of the
following four keys on the console of the 4006: (1) page,
(2) return, (3) line feed, and (4) cntl H (= backspace).
The page key is a special function key which is used to
erase the display when the key is depressed. When this
occurs, the alpha cursor moves to the upper left corner of
the screen (known as the "Home position"). The return key
causes the alpha cursor to be positioned at the left hand
starting point of the present line. The line feed key
causes the cursor to move down one line from the present
position.

INPUT/OUTPUT 

In the IBM 2250 we read the 2250 buffer table (BT),
thereby updating BT to include any data that has been typed
in from the alphameric keyboard. These data are stored in
the FT04F001 dummy buffer from which they can now be read
using a formatted read statement. But there is no
graphics buffer, buffer table or display variable on the
Tektronix system. We can modify the GMS package to read or
write directly from the graphics terminal.

We do not have to define the display variable in the
Tektronix system, so the subroutines DISPLA, APPEND, SIZE, and
DVDUMP of GRAF will become dummy subroutines. The
subroutines SCTDM, SCTDV, and BUFR$ of the GRAF package read
the data that has been typed in from the alphameric keyboard.
We can modify them to use the Tektronix subroutines "CHOUT"
and "CURSES".


CHAPTER 4

SUPPORTING ALGORITHMS AND SOFTWARE

4.1  Statistical Distribution Package

Documentation:  Appendix  X,  appendix  vol.  VI,  pp.  210-241,  and  THEMIS 
report  No.  36.  (Bouver  and  Bargmann  [42]) 

Little programming effort is required to provide the standard
distribution functions (normal, Gamma, Beta, and its derivations) for
moderate precision, not too extreme tails, and integer or half-integer
values of parameters. In fact, such programs have been prepared, by
the use of series or continued-fraction algorithms, on programmable
portable calculators.

However, in the case of real-valued non-integer parameters, extreme
tails, or for very large or very small degrees-of-freedom, high
precision is quite difficult to attain. But such high precision is required,
e.g., in the evaluation of distribution functions belonging to the
Pearson class, or for the efficient evaluation of bivariate
distributions; in the latter case the truncated probability density function
is used as a kernel to determine Gaussian points. A very precise
evaluation of moments of such truncated distributions is required, and
depends upon distribution functions.

As mentioned in Reference [42], we developed a universal statistical
distribution package including normal (YORM), Gamma (GAM), Beta (BET),
t (TT), chi-square (CHI), and F (FF). These are FORTRAN library
functions; those ending in X evaluate probabilities given abscissas as
input; those ending in P evaluate abscissas ("percentage points"), given
probabilities as input, and those ending in Z evaluate ordinates, given
abscissas. All arguments are real-valued.

A study of efficiencies of various modules showed that, for the
normal distribution, for all arguments, and all precision, the two
continued fractions (Shenton, for |x|<2.5, Laplace for |x|>2.5) are best.
Percentage points are obtained by Newton-Raphson iteration. A precision
of 11 relative places can be guaranteed in the Control Data version
(single precision, where floating point arithmetic is precise to about
13 decimals). A double precision version, with 25-place precision, is
also available.
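
The combination of two continued fractions with Newton-Raphson inversion can be sketched as follows. This is an illustration in Python, not the FORTRAN library code: the function names merely follow the X/P/Z naming convention described above, the continued fractions are the standard textbook forms (the central one attributed to Shenton, the tail one to Laplace), and the fixed numbers of terms and the starting value 0 for the iteration are assumptions of this sketch.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


def norm_z(x):
    """Ordinate (density) of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / SQRT_2PI


def _central_cf(x, terms=60):
    # x / (1 - x^2/(3 + 2x^2/(5 - 3x^2/(7 + ...)))), evaluated bottom-up
    t = 0.0
    for k in range(terms, 0, -1):
        sign = -1.0 if k % 2 else 1.0
        t = sign * k * x * x / ((2 * k + 1) + t)
    return x / (1.0 + t)


def _tail_cf(x, terms=200):
    # 1 / (x + 1/(x + 2/(x + 3/(x + ...)))), evaluated bottom-up, x > 0
    t = 0.0
    for k in range(terms, 0, -1):
        t = k / (x + t)
    return 1.0 / (x + t)


def norm_x(x):
    """Probability P[X <= x] for the standard normal distribution."""
    a = abs(x)
    if a < 2.5:
        upper = 0.5 - norm_z(a) * _central_cf(a)   # central region
    else:
        upper = norm_z(a) * _tail_cf(a)            # tail region
    return 1.0 - upper if x >= 0 else upper


def norm_p(p, tol=1e-12, max_iter=100):
    """Percentage point: the abscissa x with P[X <= x] = p (0 < p < 1)."""
    x = 0.0
    for _ in range(max_iter):
        step = (norm_x(x) - p) / norm_z(x)   # Newton-Raphson correction
        x -= step
        if abs(step) < tol:
            break
    return x
```

Starting the iteration at zero keeps every iterate on the same side of the solution, because the distribution function is concave to the right of zero and convex to the left.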

In the Gamma distribution, for high-precision evaluation, the
infinite series expansions (non-integer parameters) are useful only if
the argument is small (less than the mean) and the degrees-of-freedom
are  small.  The  continued  fraction  algorithm  spans  a considerably 
wider  range  but,  if  the  "degrees-of-freedom"  exceed  100,  none  of  the 
direct  evaluations  are  useful.  Limiting  forms  have  been  studied  (direct 
approach  to  normality,  approach  of  the  cube  root  to  normality  (Wilson- 
Hilferty))  but  they  can  be  used  for  precision  only  if  the  degrees-of- 
freedom  are  in  the  range  of  one  million  or  greater.  For  the  important 
intermediate  range  an  algorithm  was  used  which  expands  the  integral 
into  Hermite  polynomials,  after  expansion  of  the  exponent  around  its 
maximum  value.  This  routine  is  both  fast  and  precise  for  degrees-of- 
freedom  60  or  greater.  Thus,  we  possess  a Gamma  distribution  routine 
which,  on  Control  Data  Cyber,  has  a relative  precision  of  11  decimal 
digits.  Double  precision  versions,  to  23  decimals,  are  also  available 
(though  not  built  in  as  library  functions). 
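
For orientation, the cube-root (Wilson-Hilferty) limiting form mentioned above can be written in a few lines. The sketch is stated for the chi-square distribution (a Gamma variable with shape a corresponds to chi-square with 2a degrees of freedom after doubling the abscissa). It uses the error function from the Python standard library for the normal probability, and the function names are illustrative. As the text notes, this approximation is not adequate for high-precision work at moderate degrees of freedom.

```python
import math


def norm_cdf(z):
    """Standard normal probability P[Z <= z]."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def chi2_cdf_wilson_hilferty(x, df):
    """Approximate P[chi-square(df) <= x] by the cube-root normal approximation.

    (X/df)^(1/3) is treated as normal with mean 1 - 2/(9 df)
    and variance 2/(9 df).
    """
    c = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return norm_cdf(z)
```

At the tabulated upper 5 percent points for 10 and 100 degrees of freedom the approximation agrees with 0.95 to about three decimals, which illustrates why it cannot serve an 11-digit requirement.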

The  Beta  distribution  has  much  the  same  properties  as  the  Gamma 
distribution.  Continued-fraction  expansions  are  very  good  if  the 
larger  of  the  two  exponents  is  less  than  50;  between  50  and  200  a 
Hermite  expansion  works  best;  beyond  that,  the  approach  to  a Gamma 
distribution  is  very  good.  The  Beta  distribution  routines  built  in  as 
library  functions  in  the  Control  Data  Cyber,  at  the  University  of 
Georgia,  have  a precision  of  at  least  11  significant  digits.  Double 
precision  routines,  to  23  significant  digits,  are  also  available. 

Incidentally, the "complete" Gamma function (actually $\log \Gamma(x)$)
is available as single precision (12 significant digits precision) and
double precision (23 significant digits precision) under the name
DLGGM. It uses the Euler-Maclaurin sum formula and is very efficient.
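
A common way to evaluate the logarithm of the Gamma function from the Euler-Maclaurin formula is the Stirling asymptotic series combined with an upward recurrence. The Python sketch below shows that approach; it is not the DLGGM source, and the shift threshold of 10 and the use of four correction terms are choices made for this illustration.

```python
import math


def log_gamma(x, shift_to=10.0):
    """log Gamma(x) for x > 0 by upward recurrence plus an asymptotic series."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    correction = 0.0
    while x < shift_to:
        correction += math.log(x)   # uses Gamma(x) = Gamma(x + 1) / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # 1/(12x) - 1/(360x^3) + 1/(1260x^5) - 1/(1680x^7)
    series = inv * (1.0 / 12.0 - inv2 * (1.0 / 360.0
                    - inv2 * (1.0 / 1260.0 - inv2 / 1680.0)))
    stirling = (x - 0.5) * math.log(x) - x + 0.5 * math.log(2.0 * math.pi)
    return stirling + series - correction
```

With the argument shifted to at least 10, the first neglected term of the series is below 1e-12, so a handful of terms is sufficient.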

4.2  Non-Central Distributions

Documentation:  THEMIS  report  No.  10  (Thomas  [14])  and  Appendix  B, 
appendix  vol.  I,  pp.  31-49. 

The non-central Gamma distribution (modified Bessel function) and
two non-central Beta distributions (${}_1F_1$ and ${}_2F_1$ hypergeometric) have
been included in our statistical support package. The non-central
Gamma distribution function evaluates the probability integral given
the abscissa, shape parameter (exponent plus one) and non-centrality
parameter. For use as non-central chi-square distribution, each parameter
(and the abscissa) needs to be divided by 2. Through storage of terms
in the expansion of the (central) incomplete Gamma integral, the
evaluation of the non-central Gamma distribution can be done as quickly as
that of the central one.
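
One way to read the remark about stored terms is the following Python sketch. It treats the non-central Gamma distribution as a Poisson-weighted sum of central incomplete Gamma ratios and obtains each successive central term from the previous one by subtracting a single stored series term. This is an assumption about the approach, not the library routine itself; the function names are hypothetical, and the simple recurrence is only suitable for moderate arguments and non-centrality.

```python
import math


def gamma_cdf(a, x, tol=1e-15, max_terms=10000):
    """Central incomplete Gamma ratio P(a, x) by its power series."""
    if x <= 0.0:
        return 0.0
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    return min(total, 1.0)


def noncentral_gamma_cdf(x, a, delta, tol=1e-15, max_terms=10000):
    """Poisson(delta)-weighted sum of central probabilities P(a + j, x)."""
    if x <= 0.0:
        return 0.0
    central = gamma_cdf(a, x)                                     # P(a, x)
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))   # stored term
    weight = math.exp(-delta)                                     # Poisson weight, j = 0
    total = weight * central
    for j in range(1, max_terms):
        central = max(central - term, 0.0)   # P(a+j, x) = P(a+j-1, x) - term
        term *= x / (a + j)                  # next stored term
        weight *= delta / j                  # next Poisson weight
        total += weight * central
        if weight * central < tol * total and j > delta:
            break
    return total


def noncentral_chi2_cdf(x, df, lam):
    """Non-central chi-square: abscissa and both parameters are divided by 2."""
    return noncentral_gamma_cdf(x / 2.0, df / 2.0, lam / 2.0)
```

Only one series for the central integral is summed; every further Poisson term costs one subtraction and two multiplications.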

In the non-central Beta distribution, two cases have been considered:

(a)  a convolution of Poisson terms and central Beta (non-central F,
power of the analysis-of-variance test, confluent hypergeometric), and

(b)  a convolution of negative binomial terms and central Beta (non-central
distribution of the square of the multiple correlation, ${}_2F_1$
hypergeometric).

When the modules were tested, evaluation was done by two
different expansions: the usual infinite series and a finite double
sum. As is demonstrated in Ref. [14], the results agreed to 10 significant
digits. For the non-central F distribution, the calling sequence is

$Y = \mathrm{BNC5}(x,\ m/2,\ n/2,\ \gamma^2/2,\ 1)$

where $x = mF/(mF + n)$, the usual conversion of F (with m and n degrees of
freedom) into Beta; $\gamma^2$ is the non-centrality parameter associated with
the non-central $\chi^2$ distribution with m degrees of freedom.

Because of its restricted use for the distribution of multiple
correlations, the arguments of the non-central Beta distribution of the
second kind have a different interpretation. The calling statement

$P = \mathrm{BNC5}(x,\ q,\ n,\ \rho^2,\ 2)$

returns in P the probability $P(R^2 < x)$ ($R^2$ = square of multiple
correlation) given q predictors, sample size n (degrees-of-freedom for
error plus 1, if sample is not from a single population), and a
population value $\rho^2$ for the multiple correlation, squared.

The non-central distribution functions are built-in library
functions in our Control Data Cyber, and accessible to the IBM graphics
and batch systems (see their use in Section 2.8).

4.3  The  Pearsonian  System 

Documentation: Appendix M, appendix vol. VI, pp. 145-209, THEMIS
reports No. 28 (Bouver [36]), No. 32 (Bouver and Bargmann [39]), and
No. 36 (Bouver and Bargmann [42])

Traditionally, the Pearsonian system of distributions has been
used to fit empirical data by a member of this extensive class of
distributions. However, as early as 1949, percentage points of the
Pearsonian distributions were published in the Biometrika Tables,
later improved in the CRC Handbook of Probability and Statistics.
These tables represent excellent tools for approximating any
distribution on the basis of 4 moments. The system is a natural extension
of the "central limit" normal distribution which is a two-moment
approximation and, itself, a member of the Pearson class. Beta and Gamma
distributions are special cases, too.

Our concern was to build library functions which enable a user to
look up probabilities or percentage points of any member of the
Pearsonian class; we also extended the tables far beyond the values given
in the Biometrika Table 42, and supplied six-place precision.

With one exception (Type IV) the distributions are reducible to
the standard distributions described in Section 4.1. For the Type IV
evaluation, numerical quadrature techniques were employed which have
been developed and described in THEMIS Report No. 26 (Bouver and Lether
[24]).

This distribution package is a library function in the Control Data
Cyber system, at the University of Georgia. The user may call

$P = \mathrm{PEARS}(x,\ \beta_1,\ \beta_2,\ 2)$

to obtain $P[Y < x]$ for a Pearsonian distribution with skewness $\beta_1$ and
kurtosis $\beta_2$ (in the notation of the Biometrika Table 42), where Y is
standardized (mean 0, variance 1); or he may call

$Y = \mathrm{PEARS}(P,\ \beta_1,\ \beta_2,\ 2)$

to obtain the percentage point corresponding to the probability P (see
instructions to the user on pages 148-149 of Appendix volume VI).

Studies  are  now  in  progress  to  obtain  bivariate  generalizations 
which  are  well  known  theoretically,  but  quite  difficult  to  implement 
on  computers. 

4.4  Convolution  of  Truncated  Gamma 

Documentation: Appendix L, appendix vol. III, pp. 216-324, and THEMIS
Report No. 21 (Bouver and Bargmann [28]).

The distribution of a sum of independent Gamma variables with a common
scale parameter is itself a Gamma distribution. However, when these
distributions are truncated (especially common in applications of the
Gamma distribution) these sums have rather complicated distribution
functions.

A method of characteristic functions was applied to obtain exact
distributions in this instance, and tables and graphs were published.
In a dissertation by Lavender (1966), the Pearsonian system had been
used to approximate the exact distributions. We found that these
approximations were very close indeed; in fact, it was the closeness of this
agreement which prompted us to extend the library functions of the
Pearsonian distributions, as described in Section 4.3.

4.5  Generation  of  Pseudo-Random  Numbers 

Documentation:  Appendix  G,  appendix  vol.  II,  pp.  150-245  and  THEMIS 
Report  No.  15  (Cannon  and  Norman  [20]). 

The purpose of this task was the implementation of a technique
proposed by Marsaglia to generate, with great efficiency, pseudo-random
numbers under any distribution, with the need for calling an inverse
distribution function program occurring only very rarely. Such a
package was programmed in IBM Assembler language, and extensively studied
for speed of execution and validity in simulating the prescribed
distribution.

CHAPTER V

OTHER  STATISTICAL  TASKS 

Most  of  the  tasks  described  in  this  section  were  either  begun  when 
the  original  THEMIS  project  had  a wider  definition,  or  they  represented 
software  development  which  became  easy  once  the  interactive  packages 
had  been  developed  and  tested. 

5.1  Virtual  Clustering 

Documentation: THEMIS Reports No. 1 and 2 (Bargmann and Graney [1, 2]).

Virtual clustering is the appearance of concentrations of
projections of points in n-space on hyperspherical surfaces. The reason for
its importance in cluster analysis is the possibility of sorting out sample
points from mixed multivariate normal distributions. The ordinary
("real") clustering techniques would be of limited value in this
connection, since sample points from multivariate normal distributions are,
of course, clustered around the mean, but their projections onto a
hyperspherical surface around the mean would be randomly distributed
("random directions"). Departure from such randomness would indicate
the presence of more than one mean, i.e., more than one underlying
multivariate normal population.

This study resulted in an algorithm and tables of the distribution
of distances of k-neighbors. Two approaches are available to the data
analyst: (a) a region may be prescribed (as a fixed opening angle)
and the probability that k or more points fall into this region, given
random distribution of projections, may be determined; (b) the number
of points occurring around a chosen center may be fixed, and the
probability determined that the opening angle enclosing these points is θ
or smaller, given that the projections are randomly distributed. Thus,
if the observed number of points is larger than the critical value for
a given α level, or if the angle is smaller than the critical angle,
there would be evidence of significant clustering.
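
The two approaches can be illustrated with a minimal sketch. It is not
the algorithm or the tables of THEMIS Reports No. 1 and 2; it assumes
that the center (axis) of the region is fixed in advance, that the
directions are independent and uniformly distributed under the
hypothesis of randomness, and that a simple binomial tail is an
adequate stand-in for the tabulated k-neighbor distribution. All
function names are hypothetical. Under these assumptions, approach (b)
gives the same number as approach (a), because the angle enclosing the
k nearest points is θ or smaller exactly when at least k points fall
within θ of the center.

```python
import math

def cap_fraction(theta, n, steps=2000):
    """Fraction of the unit hypersphere in n-space (n >= 2) lying within
    angle theta (radians) of a fixed axis, obtained by midpoint-rule
    integration of sin(t)**(n - 2)."""
    def integral(upper):
        h = upper / steps
        return h * sum(math.sin((i + 0.5) * h) ** (n - 2) for i in range(steps))
    return integral(theta) / integral(math.pi)

def tail_probability(k, m, theta, n):
    """P(at least k of m independent random directions fall within
    angle theta of the axis): a binomial tail (simplifying assumption)."""
    p = cap_fraction(theta, n)
    return sum(math.comb(m, j) * p**j * (1.0 - p)**(m - j)
               for j in range(k, m + 1))

def count_in_cap(points, mean, axis, theta):
    """Count points whose direction from `mean` lies within theta of `axis`."""
    axis_len = math.sqrt(sum(a * a for a in axis))
    count = 0
    for pt in points:
        d = [x - c for x, c in zip(pt, mean)]
        d_len = math.sqrt(sum(v * v for v in d))
        if d_len == 0.0:
            continue  # a point located at the mean has no direction
        cos_angle = sum(v * a for v, a in zip(d, axis)) / (d_len * axis_len)
        if math.acos(max(-1.0, min(1.0, cos_angle))) <= theta:
            count += 1
    return count

# Made-up illustration: five points in 3-space, mean taken as the origin.
pts = [(1.0, 0.1, 0.0), (0.9, -0.1, 0.1), (1.1, 0.0, -0.1),
       (-1.0, 0.5, 0.2), (0.0, -1.0, 0.3)]
theta = math.radians(20.0)
count = count_in_cap(pts, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), theta)
p_value = tail_probability(count, len(pts), theta, 3)
```

A small value of `p_value` would point toward clustering of directions.
If the axis is chosen after inspecting the data, this simple tail
probability is too optimistic.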

The principles underlying this technique were applied in the
interactive unit on cluster configurations (see Section 1.3).

5.2 Stochastic Differential Equations

Reference: THEMIS Reports No. 6 (Adomian and Walker [6]) and No. 7
(Adomian [7]), and Adomian [22].

The problems of statistical estimation in systems of differential
equations have received considerable attention in "compartmental analysis".
The usual iterative techniques (commonly called "perturbation" or
sensitivity analyses) or expansions into Hermite polynomials would be
successful only if the error component is relatively small.

Adomian and some of his doctoral students developed algorithms which
could provide solutions even if the random component is very large. In
the task studied under this grant, the operator theory was perfected, and
some computer simulations were carried out on our IBM graphics system.
Comparison of realizations of simulated results with results obtained by
the closure technique showed good agreement. The simulation unit was
removed from the graphics system later (1974) because it was quite
costly in computer time; thus no photographs are available.

This work is still in progress, and some analytical solutions are even
becoming available. The most serious drawback is still the high cost
of computation even for "closed" solutions, since they require evaluation
of second derivatives of eigenvector elements with respect to elements in
the model matrix of a system of differential equations.

5.3  Structure  and  Distance  of  Logical  Patterns 

Reference:  Appendix  A,  appendix  vol.  I,  pp.  2-30,  and  THEMIS  Report 
No.  3,  and  £(Patel  [3,  11],  see  also  Patel  [9]). 

A logical pattern in this context is a matrix of order n by p
(p possibly much larger than n) in which the entries in each column
denote states of certain diagnostic events (e.g., symptoms of a disease).
"Calibration patterns" are stored (patterns of diagnostic events observed
at n points in time when the system is in a known state, e.g., a patient
with a known disease), and the same attributes are then observed regularly
to determine whether a given pattern is close to one of the calibration
patterns. The n points in time may exhibit dependence (serial correlation).

Since the variance-covariance matrices of such patterns are usually
singular, a distance cannot be determined directly. An assumption is
made that the states of the diagnostic events are influenced by the
states of just one (or very few) major events, which are artificial.

For  such  a latent-class  structure,  distances  can  be  calculated. 

A simpler version of this technique had been used for the evaluation
of intelligence systems. Some application to quality control has
been discussed in Reference [3].

5.4 Tools of Analysis for Pattern Recognition

Documentation: Appendix M, appendix vol. IV, pp. 2-62, and THEMIS
Report No. 22 (Kundert and Bargmann [29]).

A computer  program  has  been  developed  which  serves  as  a pre-processor 
of  data,  some  of  which  may  be  ordinal  or  categorized.  Such  data  are 
scaled  so  as  to  maximize  distances  between  criterion  groups  (Fisher- 
Lancaster  approach).  Multivariate  analyses  of  variance  and  factor 
analyses  have  been  performed  on  data  thus  pre-processed.  These  units 
have  had  considerable  application  in  food  science  (to  set  up  scales  for 
subjective  attributes  and  relate  them  to  physico-chemical  measures  or 
gas-chromatograms),  and  in  sociological  research  (effectiveness  of  birth 
control information). Examples including frequency-of-repair patterns
and  real  estate  valuation  are  discussed  in  Reference  [29]. 
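
As an illustration of the scaling step, the following minimal sketch
assigns scores to the categories of one categorized variable and to
the criterion groups so that the correlation between the two sets of
scores is as large as possible. It is not the program documented in
Appendix M; it assumes a two-way table of counts with no empty row or
column, and it uses reciprocal averaging (alternating conditional
means), a standard way of obtaining the first canonical correlation of
a contingency table. The function name and the example table are
hypothetical.

```python
import math

def optimal_scores(table, iterations=200):
    """Score the rows (criterion groups) and columns (categories) of a
    table of counts so that the correlation between row scores and
    column scores is maximal. Assumes every row and column total > 0."""
    n_rows, n_cols = len(table), len(table[0])
    total = float(sum(sum(row) for row in table))
    row_tot = [float(sum(row)) for row in table]
    col_tot = [float(sum(table[i][j] for i in range(n_rows)))
               for j in range(n_cols)]

    def standardize(scores, weights):
        # Weighted mean 0 and weighted variance 1; centering removes
        # the trivial constant solution.
        mean = sum(s * w for s, w in zip(scores, weights)) / total
        centered = [s - mean for s in scores]
        sd = math.sqrt(sum(w * s * s for s, w in zip(centered, weights)) / total)
        if sd < 1e-12:
            raise ValueError("no association left to scale")
        return [s / sd for s in centered]

    def row_means(y):
        return [sum(table[i][j] * y[j] for j in range(n_cols)) / row_tot[i]
                for i in range(n_rows)]

    def col_means(x):
        return [sum(table[i][j] * x[i] for i in range(n_rows)) / col_tot[j]
                for j in range(n_cols)]

    # Arbitrary non-constant starting scores for the categories.
    y = standardize([float(j) for j in range(n_cols)], col_tot)
    for _ in range(iterations):
        x = standardize(row_means(y), row_tot)
        y = standardize(col_means(x), col_tot)
    x = standardize(row_means(y), row_tot)
    rho = sum(table[i][j] * x[i] * y[j]
              for i in range(n_rows) for j in range(n_cols)) / total
    return x, y, rho

# Made-up 2 x 2 table; here the result equals the phi coefficient, 0.6.
x_scores, y_scores, rho = optimal_scores([[40, 10], [10, 40]])
```

The scaled categories can then be passed to an ordinary multivariate
analysis in place of the original labels.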

5.5 Scaling of Multi-dimensional Categorized Data

Reference: THEMIS Report No. 34 (Chang and Bargmann [40]).

The  Fisher-Lancaster  approach  to  scaling  categorized  data  is  limited 
to  the  scaling  of  two  nominal  variables  against  each  other  (one  is 
usually  referred  to  as  a criterion  variable).  This  task  attempts  to 
generalize  the  pre-processing  routines  described  in  Section  5.4.  The 
canonical  correlation  was  replaced  by  a generalized  measure  (Steel's 
determinant  of  a correlation  matrix)  so  that  the  Lancaster  method  could 
be  used  in  the  three  and  more  variable  case. 

The computer program handles up to five nominal variables. Validation
studies were performed to demonstrate that the scaling technique
does, in fact, correctly calculate the expected values under segments
of a multivariate normal distribution.

In 1975 we discovered a union-intersection statistic (eccentricity
of an ellipse) which more appropriately generalizes the canonical
correlation to three or more sets of variables. Also, this statistic
is more easily optimized than the determinant of a correlation matrix.
Thus, the pre-processing program has been changed, and the later
version is far more efficient.